Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download blobs from the node that gave us the certificate. #3034

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 29 additions & 9 deletions linera-core/src/client/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1185,7 +1185,8 @@ where
let block_chain_id = certificate.value().chain_id();
let block_height = certificate.value().height();

self.receive_certificate_internal(certificate, mode).await?;
self.receive_certificate_internal(certificate, mode, None)
.await?;

// Make sure a quorum of validators (according to our new local committee) are up-to-date
// for data availability.
Expand All @@ -1207,6 +1208,7 @@ where
&self,
certificate: Certificate,
mode: ReceiveCertificateMode,
nodes: Option<Vec<RemoteNode<P::Node>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Download the blob in question from the validator that sent us the certificate and is therefore expected to have it.

This code looks like it does the opposite, i.e. downloading the certificate from the validator that has the blob. What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The certificate itself—the one we downloaded because of the blob in question—is passed into this function. But the function also automatically downloads all the ancestors and blobs, if needed.

It currently downloads those from all validators, sequentially. With this change, if a node is provided explicitly, it downloads them only from that node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might lead to some performance degradation if this one particular node becomes a bottleneck for this function call. Although IDK how to trigger that (to check) or fix the issue differently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly, that's a more general and far-reaching discussion we should have at some point: The right tradeoff between latency (ask all validators everything at the same time, take the first response) and bandwidth (ask them in sequence).

But at least it triggers exactly that bottleneck way less frequently that before: Before, we were simply asking the wrong validator often, and then, due to #3033, waiting a long time for it.

) -> Result<(), ChainClientError> {
let CertificateValue::ConfirmedBlock { executed_block, .. } = certificate.value() else {
return Err(ChainClientError::InternalError(
Expand All @@ -1227,9 +1229,13 @@ where
if let ReceiveCertificateMode::NeedsCheck = mode {
certificate.check(remote_committee)?;
}
// Recover history from the network. We assume that the committee that signed the
// certificate is still active.
let nodes = self.make_nodes(remote_committee)?;
// Recover history from the network.
let nodes = if let Some(nodes) = nodes {
nodes
} else {
// We assume that the committee that signed the certificate is still active.
self.make_nodes(remote_committee)?
};
self.client
.download_certificates(&nodes, block.chain_id, block.height)
.await?;
Expand Down Expand Up @@ -1443,7 +1449,10 @@ where
for certificate in certificates.into_values() {
let hash = certificate.hash();
let mode = ReceiveCertificateMode::AlreadyChecked;
if let Err(e) = client.receive_certificate_internal(certificate, mode).await {
if let Err(e) = client
.receive_certificate_internal(certificate, mode, None)
.await
{
warn!("Received invalid certificate {hash}: {e}");
}
}
Expand Down Expand Up @@ -1807,15 +1816,26 @@ where
for blob_id in blob_ids {
let mut certificate_stream = validators
.iter()
.map(|remote_node| remote_node.download_certificate_for_blob(blob_id))
.map(|remote_node| async move {
let cert = remote_node.download_certificate_for_blob(blob_id).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still keep the old semantics of running all futures in parallel thanks to async move?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it maps each remote_node to a future (the async move block) and then creates the FuturesUnordered from them.

Ok::<_, NodeError>((remote_node.clone(), cert))
})
.collect::<FuturesUnordered<_>>();
loop {
let Some(result) = certificate_stream.next().await else {
missing_blobs.push(blob_id);
break;
};
if let Ok(cert) = result {
if self.receive_certificate(cert).await.is_ok() {
if let Ok((remote_node, cert)) = result {
if self
.receive_certificate_internal(
cert,
ReceiveCertificateMode::NeedsCheck,
Some(vec![remote_node]),
)
.await
.is_ok()
{
break;
}
}
Expand Down Expand Up @@ -2578,7 +2598,7 @@ where
&self,
certificate: Certificate,
) -> Result<(), ChainClientError> {
self.receive_certificate_internal(certificate, ReceiveCertificateMode::NeedsCheck)
self.receive_certificate_internal(certificate, ReceiveCertificateMode::NeedsCheck, None)
.await
}

Expand Down
Loading