Returning Awkward Arrays vs returning NumPy arrays from Awkward functions #532
Replies: 7 comments
-
In other words, I don't understand why awkward arrays do not transform into numpy arrays when the result is representable as a numpy array. Requiring the developer to do the conversion explicitly with np.asarray does not seem user-friendly.
-
The opposite, mixing Awkward-brand and NumPy-brand arrays in return values, was a problem that we wanted to fix from Awkward 0. Not every flatten would return a flat array:
>>> array = ak.Array([[[0, 1, 2], []], [], [[3, 4]], [[5], [6, 7, 8, 9]]])
>>> array
<Array [[[0, 1, 2], []], ... 5], [6, 7, 8, 9]]] type='4 * var * var * int64'>
>>> ak.flatten(array)
<Array [[0, 1, 2], [], ... [5], [6, 7, 8, 9]] type='5 * var * int64'>
>>> ak.flatten(ak.flatten(array))
<Array [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] type='10 * int64'>
Having return types ( For normal data analysis, though, you shouldn't be extracting things from
-
Sorry Jim, but you are closing issues so fast that I feel like you are ending a discussion one-sidedly. In numpy, flatten always reduces to 1D, so if
-
I only closed this one and a duplicate (which was resolved with the requested feature, by the way). That's because in this one, you're asking for something that is way outside the design of the library: returning different Python types depending on array type was the bug we fixed with this redesign. In all of these, you are coming from a different point of view: I think your needs are simpler. You have reason to believe that your arrays are ListOffsetArrays of one level deep. If that's always the case, it's less onerous to work with the nodes and indexes directly (including mutation) and you don't need the full abstraction. This redesign was motivated by the problems people were having with more complex cases, with missing values, unions, and chunking. There are good reasons to have multiple ways of expressing the same array, with and without an IndexedArray, for instance (which delays rearrangement), but those reasons are technical and would be very distracting in a data analysis. However, this is a difference in layout that could make the difference between an Awkward Array and a NumPy array. Finally, we do make sure that any function with the same name as NumPy has the same behavior. However, NumPy doesn't have a
-
I thank you for the explanation, but I am talking about another aspect. Firstly, you are right, numpy does not have a free function called The primary point is what you think My interest in this is purely practical and based on the analysis that my group does in LHCb. I argue that this is a very useful function and it could just do the same for (some) awkward arrays. Whether
-
I think of methods vs free functions as being a different interface, though that's a matter of definition. Awkward Arrays don't have So by "generalizing NumPy," we mean "generalizing a subset of NumPy's free functions." Since The feature you want is One of the unpleasant consequences of generalizing NumPy is that reducers like Is the argument here that the default As for its performance characteristics,
>>> base = np.arange(2*3*5).reshape(2, 3, 5)
>>> base
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
>>> sliced = base[::-1, ::2, 1:-1]
>>> sliced
array([[[16, 17, 18],
        [26, 27, 28]],

       [[ 1,  2,  3],
        [11, 12, 13]]])
>>> result = sliced.flatten()
>>> result
array([16, 17, 18, 26, 27, 28,  1,  2,  3, 11, 12, 13])
>>> result.base
Non-contiguous ListArrays are the moral equivalent of that.
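For readers who want to check the copy-versus-view behavior in the NumPy session above, here is a minimal NumPy-only sketch. The names flat_copy and flat_ravel are illustrative and not from the thread. It relies on two documented behaviors: ndarray.flatten always returns a copy, and np.ravel returns a view only when the memory layout allows it.

```python
import numpy as np

base = np.arange(2 * 3 * 5).reshape(2, 3, 5)
sliced = base[::-1, ::2, 1:-1]   # strided, non-contiguous view into base

# ndarray.flatten always allocates a new 1D array, so it owns its data.
flat_copy = sliced.flatten()

# np.ravel only copies when it has to: base is contiguous, so this is a view.
flat_ravel = np.ravel(base)

print(sliced.flags["C_CONTIGUOUS"])              # False
print(flat_copy.base is None)                    # True
print(np.shares_memory(flat_ravel, base))        # True
print(np.shares_memory(np.ravel(sliced), base))  # False: ravel had to copy
```

The last line is the case that matters for the analogy: once the data are not laid out contiguously, producing a flat array means copying, whichever function is called.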
-
I'm just adding to this issue now that we're a little further along the line, in case new users discover this discussion. Awkward now has
-
What is the intended use case for ak.flatten? If it converts an arbitrarily complex awkward array into a flat 1d array, then I think it should return a numpy array, not a 1D awkward array. I wanted to call some of my array transforms, e.g. with ak.flatten instead of np.asarray(array.layout.content), but that still does not work in my code, since neither the type returned by ak.flatten nor by array.layout.content has the dtype property.