I am currently studying the CudaSift source code, and I think I have found some unnecessary uses of __syncthreads() in some of the kernel functions in matching.cu.
For example, in the kernel function FindMaxCorr:
__global__ void FindMaxCorr(float *corrData, SiftPoint *sift1, SiftPoint *sift2, int numPts1, int corrWidth, int siftSize)
{
  .........
  if (tx==6)
    sift1[p1].score = maxScore[ty*16];
  if (tx==7)
    sift1[p1].ambiguity = maxScor2[ty*16] / (maxScore[ty*16] + 1e-6);
  if (tx==8)
    sift1[p1].match = maxIndex[ty*16];
  if (tx==9)
    sift1[p1].match_xpos = sift2[maxIndex[ty*16]].xpos;
  if (tx==10)
    sift1[p1].match_ypos = sift2[maxIndex[ty*16]].ypos;
  __syncthreads();
}
On line 160, just before the kernel returns, FindMaxCorr calls __syncthreads(). What confuses me is that this call is the last statement the kernel executes, so no thread reads shared memory after it. Shouldn't it be unnecessary to synchronize the threads here?
The same issue appears in FindMaxCorr1, FindMaxCorr2, and FindMaxCorr3.
Thanks very much! :)
Yes, there is a bit of cleaning up to do there. Sometimes when I detect oddities in the output, I add an unnecessary synchronization just in case. In fact, those final writes should all be done by the same thread, since they cannot be parallelized anyway. Thank you for pointing it out.
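To illustrate the cleanup described above, here is a minimal sketch, not CudaSift's actual code. It assumes a simplified layout in which one block handles one point and scans 16 candidate scores. DemoPoint, WriteBestMatch, RunDemo, WIDTH and NUM_PTS1 are hypothetical names made up for this example. The barrier after the shared-memory load is kept because thread 0 reads values written by other threads. No barrier follows the final writes, because no thread reads shared memory afterwards and the results become visible to the host once the kernel has completed.

```cuda
#include <cassert>
#include <cuda_runtime.h>

#define WIDTH 16               // hypothetical number of candidates per point
constexpr int NUM_PTS1 = 2;    // hypothetical number of points to match

// Hypothetical stand-in for SiftPoint, reduced to the fields used in the issue.
struct DemoPoint {
  float xpos, ypos;
  float score, ambiguity;
  int match;
  float match_xpos, match_ypos;
};

// One block per point in pts1, WIDTH threads per block.
__global__ void WriteBestMatch(const float *corr, DemoPoint *pts1, const DemoPoint *pts2)
{
  __shared__ float scores[WIDTH];
  const int tx = threadIdx.x;
  const int p1 = blockIdx.x;
  scores[tx] = corr[p1 * WIDTH + tx];
  __syncthreads();  // needed: thread 0 reads values stored by the other threads

  if (tx == 0) {
    // Sequential scan for the best and second-best score.
    // Assumes scores are non-negative, so 0 is a safe initial second best.
    float best = scores[0], second = 0.0f;
    int idx = 0;
    for (int i = 1; i < WIDTH; i++) {
      if (scores[i] > best) {
        second = best;
        best = scores[i];
        idx = i;
      } else if (scores[i] > second) {
        second = scores[i];
      }
    }
    // All result fields are written by the same thread.
    pts1[p1].score = best;
    pts1[p1].ambiguity = second / (best + 1e-6f);
    pts1[p1].match = idx;
    pts1[p1].match_xpos = pts2[idx].xpos;
    pts1[p1].match_ypos = pts2[idx].ypos;
  }
  // No trailing __syncthreads(): nothing is read from shared memory after this point.
}

// Host helper: fills 'out' (room for NUM_PTS1 entries) and returns false on any CUDA error.
bool RunDemo(DemoPoint *out)
{
  DemoPoint h1[NUM_PTS1] = {};
  DemoPoint h2[WIDTH] = {};
  float hCorr[NUM_PTS1 * WIDTH];
  for (int i = 0; i < WIDTH; i++) {
    h2[i].xpos = (float)i;
    h2[i].ypos = 2.0f * i;
    hCorr[i] = 0.1f;          // background scores for point 0
    hCorr[WIDTH + i] = 0.2f;  // background scores for point 1
  }
  hCorr[3] = 0.9f;            // best candidate for point 0
  hCorr[7] = 0.5f;            // second-best candidate for point 0
  hCorr[WIDTH + 12] = 0.8f;   // best candidate for point 1

  float *dCorr = nullptr;
  DemoPoint *d1 = nullptr, *d2 = nullptr;
  bool ok = cudaMalloc((void **)&dCorr, sizeof(hCorr)) == cudaSuccess &&
            cudaMalloc((void **)&d1, sizeof(h1)) == cudaSuccess &&
            cudaMalloc((void **)&d2, sizeof(h2)) == cudaSuccess &&
            cudaMemcpy(dCorr, hCorr, sizeof(hCorr), cudaMemcpyHostToDevice) == cudaSuccess &&
            cudaMemcpy(d1, h1, sizeof(h1), cudaMemcpyHostToDevice) == cudaSuccess &&
            cudaMemcpy(d2, h2, sizeof(h2), cudaMemcpyHostToDevice) == cudaSuccess;
  if (ok) {
    WriteBestMatch<<<NUM_PTS1, WIDTH>>>(dCorr, d1, d2);
    ok = cudaGetLastError() == cudaSuccess &&
         cudaDeviceSynchronize() == cudaSuccess &&
         cudaMemcpy(out, d1, sizeof(h1), cudaMemcpyDeviceToHost) == cudaSuccess;
  }
  cudaFree(dCorr);
  cudaFree(d1);
  cudaFree(d2);
  return ok;
}
```

Calling RunDemo from a small main() with an array of NUM_PTS1 DemoPoint entries should be enough to try it out. Whether a single writing thread or the original one-thread-per-field pattern is faster in the real kernels would need to be measured.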