Race during insertion of key/values #23
Comments
@AKKamath did you see other races of this type? Thank you for this report!
This pattern seems to be repeated in many locations.
Hi @AKKamath, although the bucket read operation doesn't perform any synchronization, […]. Could you explain what scenario would produce a race condition? I'm also curious about how you found these multiple race conditions.
Since the read operation is "weak", it has no consistency guarantees. A simple fix would be to make the weak loads "strong", either by typecasting the pointers to volatile or by using an atomic operation instead, e.g. atomicAdd(getPointerFromBucket(src_bucket, laneId), 0).
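For concreteness, here is a minimal sketch of the two suggested fixes side by side with the original weak load. This is not SlabHash code: `bucket_ptr` merely stands in for the pointer returned by getPointerFromBucket(src_bucket, laneId), and the kernel exists only to show the three flavors of read.

```cuda
#include <cstdint>

// Minimal sketch (not SlabHash code): bucket_ptr stands in for the pointer
// returned by getPointerFromBucket(src_bucket, laneId) in the issue.
__global__ void load_variants(uint32_t* bucket_ptr, uint32_t* out) {
  // Weak load: no atomicity or ordering guarantees; it can race with a
  // concurrent atomicCAS on the same address.
  uint32_t weak = *bucket_ptr;

  // Fix 1: read through a volatile pointer so the compiler cannot cache,
  // reorder, or elide the load ("strong" in the sense used above).
  uint32_t via_volatile = *(volatile uint32_t*)bucket_ptr;

  // Fix 2: an atomic read-modify-write with an identity operand, which
  // participates in the same atomicity domain as the racing atomicCAS.
  uint32_t via_atomic = atomicAdd(bucket_ptr, 0u);

  out[0] = weak;
  out[1] = via_volatile;
  out[2] = via_atomic;
}
```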
Thanks for sharing that paper; I had read it before.
So by definition (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#conflict-and-data-races), there is a data race between the weak load and the atomic.
I continue to be interested in getting to the bottom of this. @maawad: is there a data race (and is it benign), or is this not a race?
I noticed that there were races in the code. For example, in line 45 of src/concurrent_map/warp:

```cuda
uint32_t src_unit_data = (next == SlabHashT::A_INDEX_POINTER)
                             ? *(getPointerFromBucket(src_bucket, laneId))
                             : *(getPointerFromSlab(next, laneId));
```

Here a weak load is used to read data from the bucket. Later on (line 65 / line 83), an atomicCAS is performed on the same location. Different threads may concurrently issue the weak read and the atomicCAS on the same location, creating a race; a sketch of the pattern follows below.
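For illustration only, here is a hypothetical distillation of that pattern into a self-contained kernel. The kernel, its parameters, and the kEmptyKey sentinel are not from SlabHash; they only stand in for the weak load at line 45 and the atomicCAS at lines 65/83.

```cuda
#include <cstdint>

// Hypothetical sketch of the reported pattern: one thread performs a weak
// load of a slab unit while another thread issues an atomicCAS on the same
// location. Under the PTX memory model, two conflicting accesses form a
// data race unless both are "morally strong" (e.g., both atomic); the weak
// load here makes the pair a race.
__global__ void race_pattern(uint32_t* slab_unit, uint32_t new_key) {
  const uint32_t kEmptyKey = 0xFFFFFFFFu;  // stand-in empty-slot sentinel

  if (threadIdx.x == 0) {
    // Weak read of the slab unit (cf. line 45):
    uint32_t src_unit_data = *slab_unit;
    (void)src_unit_data;  // traversal logic elided
  } else if (threadIdx.x == 1) {
    // Concurrent strong write to the same location (cf. lines 65/83):
    atomicCAS(slab_unit, kEmptyKey, new_key);
  }
}
```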