Member

@jxs jxs commented Sep 16, 2025

Description

This is a draft implementation of partial messages for gossipsub following the spec PR and based on the Go implementation. Still WIP but should give a good idea of the direction we're heading.

{
    // Return err if trying to publish the same partial message state we currently have.
    if existing.available_parts() == partial_message.available_parts() {
        return Err(PublishError::Duplicate);

I don't think this is correct.

  • Imagine you have parts 1,2,3.
  • You tell your peers about those parts.
  • A peer comes back and says I want part 2.
  • You republish with the same parts in order to respond.
  • You get this error and fail to respond.
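
The scenario above can be sketched with parts modelled as a bitmask. This is a minimal sketch under stated assumptions: `PartSet`, `parts_to_send`, and `rejects_as_duplicate` are hypothetical names that do not come from the PR. It only illustrates that an unchanged local part set can still leave something to send to a peer.

```rust
/// Hypothetical bitmask of part indices (bit i set = part i is present).
/// Not a type from the PR.
type PartSet = u8;

/// Parts we could send to a peer: the ones we hold that the peer asked for.
fn parts_to_send(have: PartSet, peer_wants: PartSet) -> PartSet {
    have & peer_wants
}

/// The check under review, reduced to its core: a publish is rejected
/// whenever our own available parts did not change since the last publish.
fn rejects_as_duplicate(previous_have: PartSet, new_have: PartSet) -> bool {
    previous_have == new_have
}
```

With parts 1, 2, 3 held locally and a peer asking for part 2, `rejects_as_duplicate` is true for the republish even though `parts_to_send` is non-empty, which is the failure described in the list.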

Member Author

thanks for explaining Marco, updated as we spoke

data_transform: D,

/// Partial messages received.
partial_messages: HashMap<TopicHash, HashMap<Vec<u8>, P>>,

Do you need to store P here? I think it's better if P is owned solely by the application.

Member Author

yup, you are right, thanks Marco!

pub(crate) struct PartialData {
    pub(crate) ihave: Vec<u8>,
    pub(crate) iwant: Vec<u8>,
    pub(crate) message: Vec<u8>,

Is it useful to store the message here? It seems like wasted space, you only use it to check if the peer is sending you a duplicate.

Might be simpler to let the application handle dupes.

Member Author

thanks Marco, I only kept wanted and has: wanted to avoid sending the same message again, and has to avoid notifying the application layer of duplicates.
Thanks!

@jxs jxs force-pushed the gossipsub-partial-messages branch 5 times, most recently from cb0e925 to e6f1ae4 Compare September 26, 2025 11:35
@jxs jxs force-pushed the gossipsub-partial-messages branch from e6f1ae4 to 69c2d95 Compare September 26, 2025 14:24
Member

@dknopik dknopik left a comment

Some thoughts and some nitpicks.

#[derive(Debug)]
pub enum RpcOut {
/// Publish a Gossipsub message on network.`timeout` limits the duration the message
/// PublishV a Gossipsub message on network.`timeout` limits the duration the message
Member

Accidental change

Member Author

thanks, updated!


/// Returns metadata describing which parts of the message are available and which parts we want.
///
/// This metadata is application-defined and should encode information about
Member

Partial sentence

Member Author

thanks Daniel, updated

Comment on lines 717 to 718
// If partial set, filter out peers who only want partial messages for the topic.
fn get_publish_peers(&mut self, topic_hash: &TopicHash, partial: bool) -> HashSet<PeerId> {
Member

The fact that partial being true causes the filtering is confusing, as this is set to true on non-partial messages and vice-versa. I'd either call this filter_partial or invert the condition.
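
One way the renaming could look is sketched below. This is a simplified illustration, not the PR's code: `PeerId` is a plain integer stand-in, `PeerOpts` and the `exclude_partial_only` parameter name are hypothetical, and the real function takes `&mut self` and a `TopicHash`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a peer identifier.
type PeerId = u32;

/// Hypothetical per-peer option: the peer asked for partial messages only
/// on this topic.
struct PeerOpts {
    partial_only: bool,
}

/// The flag is named after its effect: when `exclude_partial_only` is true,
/// peers that only want partial messages are left out of the result.
fn get_publish_peers(
    peers: &HashMap<PeerId, PeerOpts>,
    exclude_partial_only: bool,
) -> HashSet<PeerId> {
    peers
        .iter()
        .filter(|(_, opts)| !(exclude_partial_only && opts.partial_only))
        .map(|(id, _)| *id)
        .collect()
}
```

A caller publishing a full message would pass `true`, and a caller publishing a partial message would pass `false`, so the argument reads the same way as the behaviour it triggers.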

Member Author

yup, agreed and updated. Thanks Daniel!

/// - Optional remaining metadata if more parts are still available after this one
fn partial_message_bytes_from_metadata(
    &self,
    metadata: Option<impl AsRef<[u8]>>,
Member

It is unclear to me what the expected behaviour is if metadata is None. I guess usually send everything we have to the peer?

For Ethereum this is the case where we still haven't received a peer's metadata and could maybe eager push data to them. There are, at a high level, 3 options:

  • Don't return anything.
  • Return everything (probably not what we want to do)
  • Return cells we didn't have locally (cells that we got from the network, not getBlobs).
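
The three options could be expressed as an explicit policy, as in the sketch below. All names here (`EagerPush`, `DataCell`, `cells_to_push`) are hypothetical and chosen for illustration; the `from_network` flag is an assumed way of recording where a cell came from and is not taken from the PR.

```rust
/// Hypothetical policy for a peer whose metadata has not been received yet.
#[derive(Clone, Copy, Debug, PartialEq)]
enum EagerPush {
    /// Don't return anything.
    Nothing,
    /// Return everything we hold.
    Everything,
    /// Return only cells that arrived from the network.
    NetworkOnly,
}

/// Hypothetical cell record; `from_network` is false for cells obtained
/// locally (for example via getBlobs).
struct DataCell {
    index: usize,
    from_network: bool,
}

/// Indices of the cells that would be pushed eagerly under `policy`.
fn cells_to_push(cells: &[DataCell], policy: EagerPush) -> Vec<usize> {
    cells
        .iter()
        .filter(|cell| match policy {
            EagerPush::Nothing => false,
            EagerPush::Everything => true,
            EagerPush::NetworkOnly => cell.from_network,
        })
        .map(|cell| cell.index)
        .collect()
}
```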

fn partial_message_bytes_from_metadata(
    &self,
    metadata: Option<impl AsRef<[u8]>>,
) -> Result<(impl AsRef<[u8]>, Option<impl AsRef<[u8]>>), PartialMessageError>;
Member

@dknopik dknopik Oct 9, 2025

If I understand correctly, if we have nothing to send to the peer, we return Ok((vec![], metadata)). This seems unintuitive and potentially inefficient to me, as we have to clone metadata and publish_partial has to compare it to the previous value. Maybe change it to allow returning Ok(None) to signal this?
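
A rough sketch of the suggested shape follows, assuming a simplified model that is not the PR's: the method is a hypothetical `bytes_for_peer` that takes the metadata as a plain `&[u8]`, the metadata is assumed to be a single-byte bitmask of parts the peer already has, and `Parts`, `Response`, and the `InvalidMetadata` variant are illustrative names.

```rust
/// Hypothetical error type standing in for the PR's `PartialMessageError`.
#[derive(Debug, PartialEq)]
enum PartialMessageError {
    InvalidMetadata,
}

/// Hypothetical message: up to 8 parts, `None` where a part is missing locally.
struct Parts {
    parts: Vec<Option<Vec<u8>>>,
}

/// `None` means "nothing to send"; otherwise (bytes, updated peer metadata).
type Response = Option<(Vec<u8>, Option<Vec<u8>>)>;

impl Parts {
    /// Assumes `metadata` is one byte: bit i set = the peer already has part i.
    fn bytes_for_peer(&self, metadata: &[u8]) -> Result<Response, PartialMessageError> {
        let peer_has = match metadata {
            [mask] => *mask,
            _ => return Err(PartialMessageError::InvalidMetadata),
        };
        let mut bytes = Vec::new();
        let mut updated = peer_has;
        for (i, part) in self.parts.iter().enumerate().take(8) {
            if let Some(data) = part {
                if peer_has & (1 << i) == 0 {
                    bytes.extend_from_slice(data);
                    updated |= 1 << i;
                }
            }
        }
        if updated == peer_has {
            // Nothing new for this peer: no clone of the metadata, no comparison by the caller.
            return Ok(None);
        }
        Ok(Some((bytes, Some(vec![updated]))))
    }
}
```

With this shape the caller can match on `Ok(None)` and skip the peer directly instead of comparing the returned metadata with the previous value.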

timeout: Delay::new(self.config.publish_queue_duration()),
RpcOut::PartialMessage {
message: message_data,
metadata: partial_message.parts_metadata().as_ref().to_vec(),
Member

Maybe get partial_message.parts_metadata() before the loop to avoid useless re-encoding (in case the implementation encodes ad-hoc)?
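
The effect of hoisting the call can be sketched as follows. The types are hypothetical: `Message` encodes its metadata on demand and counts how often that happens, peers are plain integers, and `build_rpcs` stands in for the loop that builds one `RpcOut::PartialMessage` per peer.

```rust
use std::cell::Cell;

/// Hypothetical message that encodes its metadata ad hoc on every call.
struct Message {
    parts: u8,
    /// Counts how many times the metadata was encoded.
    encodings: Cell<usize>,
}

impl Message {
    fn parts_metadata(&self) -> Vec<u8> {
        self.encodings.set(self.encodings.get() + 1);
        vec![self.parts]
    }
}

/// Builds one (peer, metadata) pair per peer. The metadata is encoded once,
/// before the loop, and cloned for each peer.
fn build_rpcs(message: &Message, peers: &[u32]) -> Vec<(u32, Vec<u8>)> {
    let metadata = message.parts_metadata();
    peers
        .iter()
        .map(|peer| (*peer, metadata.clone()))
        .collect()
}
```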

Member Author

yeah nice catch, before each peer had its own metadata so this was not possible. Updated!

///
/// Returns `Ok(())` if the data was successfully integrated, or `Err`,
/// if the data was invalid or couldn't be processed.
fn extend_from_encoded_partial_message(
Member

This is unused as far as I can tell. Yes, something like this is likely needed by the application, but as the behaviour does not need to access this, we can leave the exact interface up to the application (as maybe some external info is needed to extend the partial message).

Member Author

yup, good point. Removed it

self.leave(&topic_hash);
#[cfg(feature = "partial_messages")]
{
    self.partial_only_topics.insert(topic_hash.clone());
Member

I think this should be remove?

pub(crate) struct PartialData {
    /// The current peer partial metadata.
    pub(crate) metadata: Option<Vec<u8>>,
    /// The remaining heartbeats for this message to be deleted.
Member

I think this should not be Option. I can't think of a meaningful case where we have None associated with a peer instead of no entry at all.

Member

Never mind, we can represent that we sent all data with None.
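
Read this way, the field distinguishes three states per peer. The sketch below is one interpretation of the comment, not the PR's code: `PartialData` is reduced to the one field, peers are keyed by a plain integer, and `PeerStatus` and `peer_status` are hypothetical names.

```rust
use std::collections::HashMap;

/// Reduced version of the struct under review, keeping only the metadata field.
struct PartialData {
    metadata: Option<Vec<u8>>,
}

/// Hypothetical view of what each combination is assumed to mean.
#[derive(Debug, PartialEq)]
enum PeerStatus {
    /// No entry: nothing is known about this peer for the message.
    Unknown,
    /// Entry with `Some`: the peer's current partial metadata.
    Pending(Vec<u8>),
    /// Entry with `None`: all data has been sent to the peer.
    Complete,
}

fn peer_status(peers: &HashMap<u32, PartialData>, peer: u32) -> PeerStatus {
    match peers.get(&peer) {
        None => PeerStatus::Unknown,
        Some(PartialData { metadata: None }) => PeerStatus::Complete,
        Some(PartialData { metadata: Some(m) }) => PeerStatus::Pending(m.clone()),
    }
}
```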

@MarcoPolo

My review comments as a patch: https://notes.marcopolo.io/view/d804e35e05fa79cfc1ed38e920aa01f1

Import review commit locally:

curl https://notes.marcopolo.io/raw/d804e35e05fa79cfc1ed38e920aa01f1 | git am

Feel free to respond inline with a new patch or quote relevant sections here and respond.

(I'm experimenting with alternative code-review processes; this is a rough prototype of reviews as separate commits. Feedback welcome.)


3 participants