-
The easiest way assumes something that I don't know, but I think it might be true: you got

    interesting = ak.Array([[True, False, True], [False, True], [False]])

(where [...]). If so, then the easiest way to do what you want to do uses that boolean array in ak.where:

    A_final = ak.where(interesting, A_mod_aligned, A)

where

    A_selected_aligned = A.mask[interesting]
    A_mod_aligned = some_computation(A_selected_aligned)

and

    A_selected_aligned = ak.Array([[21.1, None, 43.1], [None, 30.4], [None]])

The mask-slice keeps all list lengths the same by inserting None where elements are not selected.

All of the above is based on my assumptions about how you got to the problem you raised. I haven't addressed the problem you raised directly; it's more complicated. I'm going to think about how to solve the exact problem you stated, but the solution isn't going to be as straightforward. (If my assumptions above are true, you'll want to use the above!)
-
Okay, I have a solution to the problem that you stated, and it's complicated because the [...]

First, although you gave me [...]

Now for the actual problem.

    A_idx = ak.Array([[0, 1, 2], [0, 1], [0]])
    A_mod_idx = ak.Array([[0, 2], [1], []])

happen to be ordered subsets, but I'm not going to assume that they're ordered. To find how they match up in general, I'll use ak.cartesian to make tables for each list of indexes from A_idx and A_mod_idx:

>>> ak.cartesian([A_idx, A_mod_idx], nested=True, axis=1).show()
[[[(0, 0), (0, 2)], [(1, 0), (1, 2)], [(2, 0), (2, 2)]],
[[(0, 1)], [(1, 1)]],
[[]]]
>>> x, y = ak.unzip(ak.cartesian([A_idx, A_mod_idx], nested=True, axis=1))
>>> x
<Array [[[0, 0], [1, 1], [2, 2]], ..., [[]]] type='3 * var * var * int64'>
>>> y
<Array [[[0, 2], [0, 2], [0, 2]], ..., [[]]] type='3 * var * var * int64'>

In each list at [...]

>>> (x == y).show()
[[[True, False], [False, False], [False, True]],
[[False], [True]],
[[]]]

The "[...]

>>> interesting = ak.any(x == y, axis=2)
>>> interesting.show()
[[True, False, True],
[False, True],
[False]]

To reverse the pointers, we need to know: at what position in the [...]

>>> ak.argmax((x == y), axis=2)
<Array [[0, 0, 1], [0, 0], [None]] type='3 * var * ?int64'>

Each element of this array is either the position of the only true or the position of the first false, since ak.argmax only finds the biggest value, even if the biggest value is 0. The list that was actually empty has no maximum, so we get None.

Let's make the non-empty lists in which there was no match also None:

>>> A_idx_aligned = ak.argmax((x == y), axis=2).mask[interesting]
>>> A_idx_aligned
<Array [[0, None, 1], [None, 0], [None]] type='3 * var * ?int64'>

These are the inverted pointers:

>>> A_mod_aligned = A_mod[A_idx_aligned]
>>> A_mod_aligned.show()
[[1.1, None, 2.1],
[None, 3.1],
[None]]

Now we can use ak.where, as I described in my previous answer:

>>> ak.where(interesting, A_mod_aligned, A).show()
[[1.1, 22.1, 2.1],
[73.1, 3.1],
[43.1]]
>>> A_final.show()
[[1.1, 22.1, 2.1],
[73.1, 3.1],
[43.1]]

But this is not a "simple problem," as you expected. If you had the equivalent of the [...]
-
I also wanted to look at this because I'm trying to fine-tune ChatGPT models to see if they can answer these sorts of questions. I have a sample of 93 questions I answered on StackOverflow that I've been using as fine-tuning data and the results have been mixed/inconclusive so far. Your question sounds particularly hard for something like this, so I thought I'd give it a try.

Don't believe these answers!

No fine-tuning (gpt-4o, temperature=0.7)

ChatGPT response

Analysis

This is not wrong! It's missing the fact that you want a vectorized approach and it's dropping down into pure Python. (Surely, there's a lot of pure Python in its dataset.) But, apart from that, it absolutely works and it would be the right pure Python approach. It picked out only the arrays from your statement of the problem that are needed in the calculation. The [...] The explanation after the code sample is good/not misleading. Pretty good! Maybe if there were more Awkward Array examples in the training data, it might have gotten this one right, without falling back on pure Python.

With fine-tuning (base model is gpt-4o, temperature=0.7)

ChatGPT response

Analysis

The duplication of [...] The calculation of [...]

>>> ak.local_index(A, axis=1)[:, :, None].show()
[[[0], [1], [2]],
[[0], [1]],
[[0]]]

and

>>> A_mod_idx.show()
[[0, 2],
[1],
[]]

The [...] It seems to be thinking that finding an equality between the indexes in [...] After that, the [...] The [...]

>>> mask
<Array [[False, True, ...], [True], [True]] type='3 * var * bool'>

that's just what it was hoping it would be. (This anthropomorphization is not intended seriously.) Then [...]

One thing that's interesting: I ran the no-fine-tuning model 5 more times and the with-fine-tuning model 5 more times (both with temperature=0.7) and all of the no-fine-tuning responses used for loops and all of the with-fine-tuning responses were vectorized. (I didn't check them all for correctness; it takes too long.) From the fine-tuning, it learned that vectorized solutions are good and for loops are bad, but it's not getting the vectorized solutions correct. (Admittedly, this is a hard problem.)

Conclusions

This is one more data point against using LLMs for directly answering Awkward questions, even with fine-tuning. It is my first evidence that the fine-tuning did something: your problem nudges the generic ChatGPT toward a for-loop solution, because your problem is not too hard if for loops are allowed, whereas the fine-tuned model avoids for loops.

As I've been finding with other examples, evaluating the correctness or incorrectness of answers takes way too much time. Papers on LLMs cite standard question-and-answer datasets, like the Jeopardy! dataset, so the "grading" is immediate and they can spend more time tuning hyperparameters. It takes me several minutes to determine if one answer is correct. (But I don't want to tune this for Jeopardy! questions; I want to tune it for Awkward Array, or for general HEP-software questions!)
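For reference, a pure Python for-loop approach of the kind discussed in the no-fine-tuning analysis could look roughly like the sketch below. This is not the model's actual response, which isn't reproduced in this thread; the function name replace_with_loops is hypothetical and the input values are assumed from the outputs shown earlier. It works on plain nested lists (an Awkward Array can be converted with ak.to_list).

```python
def replace_with_loops(A, A_mod, A_mod_idx):
    """Return a copy of A with A_mod values written at the A_mod_idx positions."""
    # Copy each inner list so the caller's data is not modified.
    result = [list(row) for row in A]

    # Walk the three nested lists in parallel, one outer entry at a time.
    for row, new_values, positions in zip(result, A_mod, A_mod_idx):
        # Each position says where in this row the matching new value goes.
        for position, value in zip(positions, new_values):
            row[position] = value
    return result


# Assumed input values, reconstructed from outputs shown in this thread.
A = [[21.1, 22.1, 43.1], [73.1, 30.4], [43.1]]
A_mod = [[1.1, 2.1], [3.1], []]
A_mod_idx = [[0, 2], [1], []]

A_final = replace_with_loops(A, A_mod, A_mod_idx)
```

This is straightforward because assignment by position is natural in a loop; the cost is that it runs in the Python interpreter, element by element, rather than in vectorized form.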
-
Hello Experts,
I have encountered a problem recently. Probably a simple one.
I want to replace some elements of an array with some modified values. e.g.
As I tried to explain, I want to replace the elements of A by A_mod at the position of the selected indices in the original array. Any sort of idea would be very helpful.
Thank you very much