-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SHM: optimize metadata #1714
SHM: optimize metadata #1714
Conversation
… into sigle "Metadata" structure
- change three_node_combination_multicast ports
…ead and less structure sizes for SHM
Put ProtocolID and ChunkDescriptor into SHM metadata: less wire overhead and less internal structure sizes for SHM Gains 2% more performance for SHM, but this is not the only thing: it opens a way to implement publish-before-allocated-and-written concept, that should dramatically increase SHM performance and improve latency! |
@@ -0,0 +1,58 @@ | |||
// | |||
// Copyright (c) 2023 ZettaScale Technology |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure it's 2025 for newly-created files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
#[stabby::stabby] | ||
pub struct Metadata<const S: usize> { | ||
headers: [ChunkHeaderType; S], | ||
watchdogs: [AtomicU64; S], // todo: replace with (S + 63) / 64 when Rust supports it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious to know more about the todo comment here. What's the intended behaviour and current limitations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watchdogs are packed as bits of u64
. This means that for S
headers we need S/64
-sized u64
array. However, stable Rust doesn't support const arithmetic for const generic parameter like this:
watchdogs: [AtomicU64; (S + 63) / 64]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation
) | ||
} | ||
|
||
pub fn count(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If count
is const, couldn't be an associated const
instead of a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed in a bit different way: pub const fn
The reason is that associated constant cannot be called through object instance, only as Type::constant
, so I need to carry full object type (and it is compilcated) everywhere
const fn
is intended to work in the way we need
LGTM. I'll just wait to merge it because of the |
Sure, that's intended as I didn't put "release" label :) |
@yellowhatter It's well understood that SHM is still marked as unstable, meaning that breaking changes may be introduced. |
@Mallets The idea is clear. I think we can add smth similar to |
I think it makes sense. This would allow us to better control the introduction of additional changes in the future. |
I have recently added versioning support. There is no additional OAM extension - I made version negotiation through SHM probing segment. That is more flexible and saves bytes on wire |
commons/zenoh-shm/src/lib.rs
Outdated
self.header.header().generation.load(Ordering::SeqCst) == self.info.generation | ||
let header = self.metadata.owned.header(); | ||
|
||
header.generation.load(Ordering::SeqCst) == self.info.generation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here two atomic operations are performed but not synchronised. What is the guarantee that watchdog_invalidated
refers to the same header.generation
?
It could happen that header.generation.load(Ordering::SeqCst) == self.info.generation
but header.watchdog_invalidated.load(Ordering::SeqCst)
(i.e. is invalidated). But between the first and the second atomic operation, the generation header could change and the watchdog_invalidated
refers to a new state?
Is this a possible scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. I will fix it, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
watchdog_mask: u64, | ||
} | ||
|
||
unsafe impl Send for OwnedMetadataDescriptor {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the guarantees of these unsafe impls
for Send
and Sync
traits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed these unsafe impls
and made OwnedMetadataDescriptor
to be Send
and Sync
fairly
|
||
#[inline(always)] | ||
pub fn header(&self) -> &ChunkHeaderType { | ||
unsafe { &(*self.header) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add # SAFETY
comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore, removed this unsafe
} | ||
|
||
pub(crate) fn validate(&self) -> u64 { | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add # SAFETY
comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore, removed this unsafe
} | ||
|
||
pub fn confirm(&self) { | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add # SAFETY
comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore, removed this unsafe
core::cmp::Ordering::Equal => {} | ||
ord => return ord, | ||
} | ||
self.watchdog_mask.cmp(&other.watchdog_mask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a very stylistic comment but this check could be moved to the line 94 as part of the Equal
branch body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second point, what would be the case that the same watchdog_atomic
pointer is associated to differentwatchdog_mask
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style fixed
For the masks and watchdog_atomic
: watchdogs are packed as bits of u64. Mask defines the exact bit within this u64. Descriptors with the same watchdog_atomic
but different masks are different descriptors. I'm thinking of storing mask as u8 carrying the exact bit position, but it is a matter of further micro optimization and should be validated
.header() | ||
.generation | ||
.fetch_add(1, std::sync::atomic::Ordering::SeqCst); | ||
let mut guard = self.available.lock().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In zenoh-core
there is the zlock!
macro, which allows to quickly instrument any lock in the whole Zenoh codebase if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
pub fn validate_challenge(&self, expected_challenge: AuthChallenge, s: &str) -> bool { | ||
let challnge_in_shm = self.challenge(); | ||
if challnge_in_shm != expected_challenge { | ||
tracing::trace!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be promoted to debug!
level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made debug
|
||
let version_in_shm = unsafe { *self.array.elem(VERSION_INDEX) }; | ||
if version_in_shm != SHM_VERSION { | ||
tracing::trace!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be promoted to debug!
level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made debug
- get rid of pointers in OwnedMetadataDescriptor - get rid of some "unsafe" in OwnedMetadataDescriptor
Co-authored-by: Luca Cominardi <[email protected]>
Co-authored-by: Luca Cominardi <[email protected]>
Co-authored-by: Luca Cominardi <[email protected]>
@yellowhatter it seems the latest changes make some tests fail in the CI. |
yes, I'm working on it |
[closes #1563] |
closes #1703
Combined SHM metadata in single storage that is now identified with single descriptor.
By-the-way fixed undiscovered edge case bug with buffer validity check in the receiver code.
Improves SHM performance by 35% (1.7M -> 2.3M) that moves SHM efficiency threshold to somewhat less than 512 bytes.
Breaks backward compatibility for SHM communication. Non-SHM is not affected.