Incorrect loop axis reference #445
Comments
Also the |
@seanlatias thoughts?
I'm not sure how TVM approaches this, but ideally we should not update the axes, since they refer to the loops of the original program. Suppose we split both the first and the second loop of the original program: the result should be the same regardless of which loop we split first. If we always update the axes, however, we need to change the way we write schedules. Here is a more concrete example.
(a)
s[C].split(C.axis[0])
s[C].split(C.axis[1])
(b)
s[C].split(C.axis[1])
s[C].split(C.axis[0])
If the axes are not updated, program (a) is the same as program (b). If the axes are updated, (a) is no longer the same as (b), which is counter-intuitive (to me at least). We can maintain a new variable like
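The order-independence argument can be checked with a toy model. The sketch below is plain Python, not the HeteroCL or TVM API: `ToyStage`, its `loops` list, and the `.outer`/`.inner` naming are assumptions made purely for illustration. `axis` is frozen at construction, so `axis[0]` and `axis[1]` always name the original loops.

```python
class ToyStage:
    """Hypothetical stage: `axis` keeps only the original loops."""

    def __init__(self, names):
        self.axis = tuple(names)   # original axes, never mutated
        self.loops = list(names)   # current loop nest, outermost first

    def split(self, loop):
        # Replace `loop` in the nest with an outer/inner pair.
        pos = self.loops.index(loop)
        outer, inner = loop + ".outer", loop + ".inner"
        self.loops[pos:pos + 1] = [outer, inner]
        return outer, inner


# Program (a): split the first original axis, then the second.
a = ToyStage(["i", "j"])
a.split(a.axis[0])
a.split(a.axis[1])

# Program (b): split the second original axis, then the first.
b = ToyStage(["i", "j"])
b.split(b.axis[1])
b.split(b.axis[0])

print(a.loops == b.loops)
```

Both orders end with the same loop nest in this model, because each `split` looks up its target by name rather than by its current position in the nest.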
That makes sense. (a) and (b) should generate the same code. The newly generated loops should always be accessed through the returned variables, as in the code below.
outer, inner = s[C].split(C.axis[0])
s[C].split(inner)
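As a rough illustration of that convention, here is a toy model in plain Python. It does not use the HeteroCL API; `ToyStage`, `loops`, and the generated loop names are hypothetical. New loops are reachable only through the values returned by `split`, while `axis` keeps listing the original loops.

```python
class ToyStage:
    """Hypothetical stage: `axis` keeps only the original loops."""

    def __init__(self, names):
        self.axis = tuple(names)   # original axes, never mutated
        self.loops = list(names)   # current loop nest, outermost first

    def split(self, loop):
        # Replace `loop` in the nest with an outer/inner pair.
        pos = self.loops.index(loop)
        outer, inner = loop + ".outer", loop + ".inner"
        self.loops[pos:pos + 1] = [outer, inner]
        return outer, inner


stage = ToyStage(["i", "j"])
outer, inner = stage.split(stage.axis[0])
# The inner loop is not in `axis`; use the returned handle to split it again.
inner_outer, inner_inner = stage.split(inner)

print(stage.loops)
print(stage.axis)
```

In this model `stage.axis` is still `("i", "j")` after both splits, so indexing into it never silently picks up a generated loop.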
It is weird that the following two code snippets have different meanings. The first approach does not add the newly generated loop to `axis`, so `C.axis[1]` refers to the original axis at index 1. However, the second approach considers the newly generated loop, so `1` refers to the split inner loop. The test case is from `test_schedule_compute.py`.
The first approach generates the following code.
And the second approach generates the code below.
I think in this case, the second approach is correct. After calling `s[C].split(C.axis[0], factor=3)`, we have new loops `ii.outer` and `ii.inner`, which should be attached to `C.axis`. Then `C.axis[1]` should refer to `ii.inner` instead of the original `jj`. Basically, the test program only checks whether `allocate B[int32 * 1 * 1 * 1]` exists, so it passes without catching the error.
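The behavior proposed here, where new loops are attached to `C.axis`, can be sketched with a toy model. This is plain Python under stated assumptions, not HeteroCL code: `LiveAxisStage` and the way it names generated loops are hypothetical. The `axis` list is rewritten by every split, so an index refers to whatever loop currently sits at that position.

```python
class LiveAxisStage:
    """Hypothetical stage: `axis` is updated by every split."""

    def __init__(self, names):
        self.axis = list(names)   # current loop nest, outermost first

    def split(self, loop):
        # Replace `loop` in `axis` with an outer/inner pair.
        pos = self.axis.index(loop)
        outer, inner = loop + ".outer", loop + ".inner"
        self.axis[pos:pos + 1] = [outer, inner]
        return outer, inner


s = LiveAxisStage(["ii", "jj"])
s.split(s.axis[0])
second = s.axis[1]        # now the inner half of the split, not `jj`
s.split(s.axis[1])
nest_a = list(s.axis)

t = LiveAxisStage(["ii", "jj"])
t.split(t.axis[1])
t.split(t.axis[0])
nest_b = list(t.axis)

print(second)
print(nest_a == nest_b)
```

Under this model `axis[1]` resolves to `ii.inner` after the first split, as the issue argues it should. The two split orders then produce different loop nests, so the meaning of an index depends on the order in which the schedule primitives are applied.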