You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm trying to understand the reason why hazard pointers require exactly two SeqCst orderings. While thinking about this, I wanted to fully understand the guarantees of SeqCst atomic operations and fences. Intuitively, fences should provide a stronger guarantee due to its name(the term fence gives a very intimidating impression!) and below are my conclusions:
Low-Level Perspective
The difference of a SeqCstfence and SeqCstatomic.store in a compiler's point of view:
fence is lowered to a mfence instruction. The mfence instruction is said to be a serializing instruction. This means that all stores prior to the mfence instruction are committed to the main memory.
store is lowerd to a xchg rax, [rbx]-like instruction. This is interesting, because rax is discarded afterwards. For Release ordering, the store is not lowered to a xchg instruction but a regular store instruction. This implies that there is something 'special' about xchg, which I don't know yet.
High-Level Perspective
All stores with the SeqCst ordering contribute to the global ordering S. All loads with the SeqCst ordering see stores accordig to the global ordering S. The term see means the following: if a thread can see a store, it means that it can observe the effects of that store.
However, this does not mean that a SeqCst load can see the last store within S. It only guarantees that if it sees a certain store s within S, it can always see t such that t < s where '<' represents the ordering within S. In other words, threads see a 'prefix of S, which is at worst nothing (null sequence) and at best the entire sequence S. Let's consider the following example:
fnthread1_func(m:&Arc<AtomicUsize>,n:&Arc<AtomicUsize>){
m.store(1,Ordering::SeqCst);// [1]let val = n.load(Ordering::SeqCst);if val == 0{read(x);}}fnthread2_func(m:&Arc<AtomicUsize>,n:&Arc<AtomicUsize>){
n.store(1,Ordering::SeqCst);// [2]let val = m.load(Ordering::SeqCst);if val == 0{free(x);}}
Let's assume that everything in thread2 was executed before thread1. Then, the global ordering S would be [2] --> [1]. However, the load of n in thread1 can yield 0, if it sees 'nothing' from S. Thus, this results in a UaF and crashes the program.
We can use fences to solve this problem.
fnthread1_func(m:&Arc<AtomicUsize>,n:&Arc<AtomicUsize>){
m.store(1,Ordering::SeqCst);// [1]fence(Ordering::SeqCst);let val = n.load(Ordering::SeqCst);// [3]if val == 0{read(x);}}fnthread2_func(m:&Arc<AtomicUsize>,n:&Arc<AtomicUsize>){
n.store(1,Ordering::SeqCst);// [2]fence(Ordering::SeqCst);let val = m.load(Ordering::SeqCst);// [4]if val == 0{free(x);}}
I understood the semantics of a SeqCst fence as follows: For stores, nothing changes. (assuming all stores had the SeqCst ordering) For loads, an additional condition is imposed. If a SeqCst fence is present before a load, it must see a subsequence of the global ordering S that includes all of the stores before the fence.
Let's see how the problem is solved in this case: Let's consider both cases, where the S is [1] --> [2] and S is [2] --> [1].
S = [1] --> [2]: The load of n in thread1 ([3]) either sees [1] or [1]-->[2]. If it sees [1], the load of n will return 0, and read(x) will be performed. In the other case, read(x) will not be performed. The load of m in thread2 ([4]) must see [1]-->[2] due to the fence. Thus, the load of m will return 1 and free(x) will never be performed. Because read(x) is 'maybe' done and free(x) is never done, the pogram is safe!
S = [2] --> [1]: The load of n in thread1([3]) must see [2]-->[1] due to the fence. Thus, the load of n will return 1 and read(x) will not be performed. The load of m in thread2 ([4]) either sees [2] or [2]-->[1]. If it sees [2], the load of m will return 0 and free(x) will be performed. In the other case, free(x) will not be performed. In conclusion, read(x) will never be performed so safety is guaranteed regardless of whether free(x) is called. Safe!
So in conclusion, fence(SeqCst) imposes some condition for load(SeqCst) and limits how it 'sees' S.
Connecting the Dots
I think it is also important to think why the usage of mfence provides this high-level conditions that I described. Let's see the code before in pseudo-assembly.
thread1:movrax,1xchgrax,[rbx] // atomic store [1]mfencemovrax,[rcx] // atomic load [3]
thread2:movrax,1xchgrax,[rcx] // atomic store [2]mfencemovrax,[rbx] // atomic load [4]
The order between the two xchgs can be arbitrary. But, by mfence, the loads are done strictly after xchg commits to memory. Thus, because there exists an order between xchgs PLUS an order between xchg and load, the high-level properties described previously are obtained.
Conclusions and Thoughts
If any of my discussions are wrong, please tell me where my thoughts are wrong. These are rather unbased thoughts of mine that were created during the process of attempting to understand what fences are.
The text was updated successfully, but these errors were encountered:
Opening
I'm trying to understand the reason why hazard pointers require exactly two
SeqCst
orderings. While thinking about this, I wanted to fully understand the guarantees ofSeqCst
atomic operations and fences. Intuitively, fences should provide a stronger guarantee due to its name(the term fence gives a very intimidating impression!) and below are my conclusions:Low-Level Perspective
The difference of a
SeqCst
fence
andSeqCst
atomic.store
in a compiler's point of view:fence
is lowered to amfence
instruction. Themfence
instruction is said to be a serializing instruction. This means that all stores prior to themfence
instruction are committed to the main memory.store
is lowerd to axchg rax, [rbx]
-like instruction. This is interesting, becauserax
is discarded afterwards. ForRelease
ordering, the store is not lowered to axchg
instruction but a regular store instruction. This implies that there is something 'special' aboutxchg
, which I don't know yet.High-Level Perspective
All stores with the
SeqCst
ordering contribute to the global orderingS
. All loads with theSeqCst
ordering see stores accordig to the global orderingS
. The term see means the following: if a thread can see a store, it means that it can observe the effects of that store.However, this does not mean that a
SeqCst
load can see the last store withinS
. It only guarantees that if it sees a certain stores
withinS
, it can always seet
such thatt
<s
where '<' represents the ordering withinS
. In other words, threads see a 'prefix ofS
, which is at worst nothing (null sequence) and at best the entire sequenceS
. Let's consider the following example:Let's assume that everything in thread2 was executed before thread1. Then, the global ordering
S
would be[2] --> [1]
. However, the load ofn
in thread1 can yield 0, if it sees 'nothing' fromS
. Thus, this results in a UaF and crashes the program.I understood the semantics of a
SeqCst
fence as follows: For stores, nothing changes. (assuming all stores had theSeqCst
ordering) For loads, an additional condition is imposed. If aSeqCst
fence is present before a load, it must see a subsequence of the global orderingS
that includes all of the stores before the fence.Let's see how the problem is solved in this case: Let's consider both cases, where the
S
is[1] --> [2]
andS
is[2] --> [1]
.S
=[1] --> [2]
: The load ofn
in thread1 ([3]
) either sees[1]
or[1]-->[2]
. If it sees[1]
, the load ofn
will return 0, andread(x)
will be performed. In the other case,read(x)
will not be performed. The load ofm
in thread2 ([4]
) must see[1]-->[2]
due to the fence. Thus, the load ofm
will return 1 andfree(x)
will never be performed. Becauseread(x)
is 'maybe' done andfree(x)
is never done, the pogram is safe!S
=[2] --> [1]
: The load ofn
in thread1([3]
) must see[2]-->[1]
due to the fence. Thus, the load ofn
will return 1 andread(x)
will not be performed. The load ofm
in thread2 ([4]
) either sees[2]
or[2]-->[1]
. If it sees[2]
, the load ofm
will return 0 andfree(x)
will be performed. In the other case,free(x)
will not be performed. In conclusion,read(x)
will never be performed so safety is guaranteed regardless of whetherfree(x)
is called. Safe!So in conclusion,
fence(SeqCst)
imposes some condition forload(SeqCst)
and limits how it 'sees'S
.Connecting the Dots
I think it is also important to think why the usage of
mfence
provides this high-level conditions that I described. Let's see the code before in pseudo-assembly.The order between the two
xchg
s can be arbitrary. But, bymfence
, the loads are done strictly afterxchg
commits to memory. Thus, because there exists an order betweenxchg
s PLUS an order betweenxchg
and load, the high-level properties described previously are obtained.Conclusions and Thoughts
If any of my discussions are wrong, please tell me where my thoughts are wrong. These are rather unbased thoughts of mine that were created during the process of attempting to understand what
fence
s are.The text was updated successfully, but these errors were encountered: