Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CX_HARDENING] Enhance DA Task Test Coverage #3568

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions crates/hotshot/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -639,6 +639,31 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<T
}
}

#[cfg(feature = "hotshot-testing")]
impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<TYPES, I, V> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just add this to the existing trait impl and enable the method when testing is enabled instead of doing separate bespoke impl.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should actually just go in testing/src/helpers.rs (ungated), next to build_system_handle (which is what we use currently in the tests). I'd call it build_inactive_handle or something (I think "lean" is slightly misleading, because it's just a dummy handle that isn't running anything).

I made an issue a while ago to do something like this for the non-integration tests (all the task-specific tests in tests_1), so I think this is good to have in general

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good One @ss-es, I wanted to acknowledge the suggestion to place the test-specific helper function in testing/src/helpers.rs. During my implementation, I encountered issues due to the private and public(crate) visibility of dependent fields in pub struct SystemContext<> and pub struct SystemContextHandle<>. To avoid modifying the existing visibility of these fields, I opted for an alternative approach.

/// Creates a lightweight version of the system handle for task state testing.
///
/// This method provides a minimal context for task state tests, omitting the full
/// consensus and network task launches. It results in fewer traces and simplifies debugging.
///
/// # Returns
/// A `SystemContextHandle<TYPES, I, V>` with minimal setup for task state testing.
pub fn create_lean_test_handle(&self) -> SystemContextHandle<TYPES, I, V> {
let (internal_sender, internal_receiver) = broadcast(EVENT_CHANNEL_SIZE);

SystemContextHandle {
consensus_registry: ConsensusTaskRegistry::new(),
network_registry: NetworkTaskRegistry::new(),
output_event_stream: self.external_event_stream.clone(),
internal_event_stream: (internal_sender, internal_receiver.deactivate()),
hotshot: self.clone().into(),
storage: Arc::clone(&self.storage),
network: Arc::clone(&self.network),
memberships: Arc::clone(&self.memberships),
}
}
}

/// An async broadcast channel
type Channel<S> = (Sender<Arc<S>>, Receiver<Arc<S>>);

Expand Down
190 changes: 186 additions & 4 deletions crates/testing/src/helpers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,15 +5,17 @@
// along with the HotShot repository. If not, see <https://mit-license.org/>.

#![allow(clippy::panic)]
use std::{fmt::Debug, hash::Hash, marker::PhantomData, sync::Arc};
use std::{fmt::Debug, hash::Hash, marker::PhantomData, sync::Arc, time::Duration};

use async_broadcast::{Receiver, Sender};
use async_compatibility_layer::art::async_timeout;
use bitvec::bitvec;
use committable::Committable;
use ethereum_types::U256;
use futures::StreamExt;
use hotshot::{
traits::{NodeImplementation, TestableNodeImplementation},
types::{BLSPubKey, SignatureKey, SystemContextHandle},
types::{BLSPubKey, Event, SignatureKey, SystemContextHandle},
HotShotInitializer, Memberships, SystemContext,
};
use hotshot_example_types::{
Expand All @@ -40,11 +42,15 @@ use hotshot_types::{
utils::{View, ViewInner},
vid::{vid_scheme, VidCommitment, VidSchemeType},
vote::{Certificate, HasViewNumber, Vote},
PeerConfig,
};
use jf_vid::VidScheme;
use serde::Serialize;

use crate::test_builder::TestDescription;
use crate::{
predicates::PredicateResult, script::ExternalEventsExpectations, test_builder::TestDescription,
test_launcher::TestLauncher,
};

/// create the [`SystemContextHandle`] from a node id
/// # Panics
Expand Down Expand Up @@ -128,6 +134,105 @@ pub async fn build_system_handle<
.expect("Could not init hotshot")
}

/// create the [`SystemContextHandle`] from a launcher
/// # Panics
/// if cannot create a [`HotShotInitializer`]
pub async fn build_system_handle_from_launcher<
Copy link
Contributor

@jparr721 jparr721 Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to make such a specific change for the tests in this case, especially since you have them working and debugging is not as big of an issue at this point. While it's not ideal to spin up the entire SystemContext for unit tests, this method unfortunately creates a rift in what is the accepted testing framework, and what is done here. If we made this change, we'd want to propagate it to all tests, lest we end up in the situation where specific knowledge of this specific suite rests solely with someone who is not on the team.

I am not opposed to the outright removal of this in theory, though I think for this PR, we should stick to the existing spawner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jparr721, I understand your concerns about consistency in our testing framework. To clarify, the existing hotshot-testing::helpers::build_system_handle() function remains unchanged, and all current test cases continue to work as expected.

The new API, hotshot-testing::helpers::build_system_handle_from_launcher(), was added to address issue #3160, which requires 'Permute the inputs and verify if the tests still pass.'. While build_system_handle() uses default test configuration parameters, the new function provides flexibility by allowing modifications to the setup.

I suggest we review its usage in the test scenarios. If it doesn’t add significant value, we can consider removing it. Specifically,

  • Validating vote collection and processing.
  • Ensuring non-leader nodes correctly handle DA votes.

Copy link
Contributor

@jparr721 jparr721 Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use the random! macro to get permuted inputs in that case? Furthermore, we do handle the vote collection and DA votes within the integration tests (e.g. test_success), so spinning up additional nodes for those might not be explicitly necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @jparr721, for your suggestion. I agree that utilizing random!() could be an effective way to introduce fuzziness in test configuration parameters. I will explore this approach and provide an update on the outcome.

TYPES: NodeType<InstanceState = TestInstanceState>,
I: NodeImplementation<
TYPES,
Storage = TestStorage<TYPES>,
AuctionResultsProvider = TestAuctionResultsProvider<TYPES>,
> + TestableNodeImplementation<TYPES>,
V: Versions,
>(
node_id: u64,
launcher: TestLauncher<TYPES, I, V>,
membership_config: Option<Memberships<TYPES>>,
) -> Result<SystemContextHandle<TYPES, I, V>, anyhow::Error> {
tracing::info!("Creating system handle for node {}", node_id);

let network = (launcher.resource_generator.channel_generator)(node_id).await;
let storage = (launcher.resource_generator.storage)(node_id);
let marketplace_config = (launcher.resource_generator.marketplace_config)(node_id);
let config = launcher.resource_generator.config.clone();

let known_nodes_with_stake = config.known_nodes_with_stake.clone();
let private_key = config.my_own_validator_config.private_key.clone();
let public_key = config.my_own_validator_config.public_key.clone();

let memberships = membership_config.unwrap_or_else(|| {
let create_membership = |nodes: &[PeerConfig<TYPES::SignatureKey>], topic: Topic| {
TYPES::Membership::create_election(
known_nodes_with_stake.clone(),
nodes.to_vec(),
topic,
config.fixed_leader_for_gpuvid,
)
};

Memberships {
quorum_membership: create_membership(&known_nodes_with_stake, Topic::Global),
da_membership: create_membership(&config.known_da_nodes, Topic::Da),
vid_membership: create_membership(&known_nodes_with_stake, Topic::Global),
view_sync_membership: create_membership(&known_nodes_with_stake, Topic::Global),
}
});

let initializer = HotShotInitializer::<TYPES>::from_genesis(TestInstanceState::new(
launcher.metadata.async_delay_config,
))
.await
.unwrap();

let system_context = SystemContext::new(
public_key,
private_key,
node_id,
config,
memberships,
network,
initializer,
ConsensusMetricsValue::default(),
storage,
marketplace_config,
);

let system_context_handle = system_context.create_lean_test_handle();

tracing::info!("Successfully created system handle for node {}", node_id);
Ok(system_context_handle)
}

/// create certificate
/// # Panics
/// if we fail to sign the data
pub fn build_da_cert<
TYPES: NodeType<SignatureKey = BLSPubKey>,
DATAType: Committable + Clone + Eq + Hash + Serialize + Debug + 'static,
VOTE: Vote<TYPES, Commitment = DATAType>,
CERT: Certificate<TYPES, Voteable = VOTE::Commitment>,
>(
data: DATAType,
membership: &TYPES::Membership,
view: TYPES::Time,
public_key: &TYPES::SignatureKey,
private_key: &<TYPES::SignatureKey as SignatureKey>::PrivateKey,
) -> CERT {
let real_qc_sig =
build_da_assembled_sig::<TYPES, VOTE, CERT, DATAType>(&data, membership, view);

let vote =
SimpleVote::<TYPES, DATAType>::create_signed_vote(data, view, public_key, private_key)
.expect("Failed to sign data!");
CERT::create_signed_certificate(
vote.date_commitment(),
vote.date().clone(),
real_qc_sig,
vote.view_number(),
)
}

/// create certificate
/// # Panics
/// if we fail to sign the data
Expand Down Expand Up @@ -171,6 +276,46 @@ pub fn vid_share<TYPES: NodeType>(
.clone()
}

/// create DA signature
/// # Panics
/// if fails to convert node id into keypair
pub fn build_da_assembled_sig<TYPES, VOTE, CERT, DATAType>(
data: &DATAType,
membership: &TYPES::Membership,
view: TYPES::Time,
) -> <TYPES::SignatureKey as SignatureKey>::QcType
where
TYPES: NodeType<SignatureKey = BLSPubKey>,
VOTE: Vote<TYPES>,
CERT: Certificate<TYPES, Voteable = VOTE::Commitment>,
DATAType: Committable + Clone + Eq + Hash + Serialize + Debug + 'static,
{
let stake_table = membership.committee_qc_stake_table();
let total_nodes = stake_table.len();
let threshold = CERT::threshold(membership) as usize;
let real_qc_pp =
TYPES::SignatureKey::public_parameter(stake_table.clone(), U256::from(threshold as u64));

let mut signers = bitvec![0; total_nodes];
let sig_lists: Vec<_> = (0..threshold)
.map(|node_id| {
let (private_key, public_key) = key_pair_for_id(node_id as u64);
let vote = SimpleVote::<TYPES, DATAType>::create_signed_vote(
data.clone(),
view,
&public_key,
&private_key,
)
.expect("Failed to sign data");

signers.set(node_id, true);
vote.signature()
})
.collect();

TYPES::SignatureKey::assemble(&real_qc_pp, signers.as_bitslice(), &sig_lists)
}

/// create signature
/// # Panics
/// if fails to convert node id into keypair
Expand Down Expand Up @@ -330,7 +475,7 @@ pub fn build_da_certificate(
payload_commit: da_payload_commitment,
};

build_cert::<TestTypes, DaData, DaVote<TestTypes>, DaCertificate<TestTypes>>(
build_da_cert::<TestTypes, DaData, DaVote<TestTypes>, DaCertificate<TestTypes>>(
da_data,
da_membership,
view_number,
Expand Down Expand Up @@ -393,3 +538,40 @@ pub fn build_fake_view_with_leaf_and_state(
},
}
}

pub async fn check_external_events<TYPES: NodeType, S: StreamExt<Item = Event<TYPES>> + Unpin>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious as to what this is doing that isn't handled by the testing macro?

Copy link
Contributor Author

@shamb0 shamb0 Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jparr721, I introduced a new function, fn check_external_events(), to validate events from the external event stream. This function provides additional validation that complements the existing testing macro. I've also created a simplified block diagram to illustrate its role within the TaskState validation context.

image

Note:: The dotted boxes labeled 'network layer' and 'external service layer' are not relevant to the 'TaskState' validation context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See remark below. This is not necessary since the network task will transform the raw event into a HotShot event, so the checks will be redundant.

mut output_stream: S,
expectations: &[ExternalEventsExpectations<TYPES>],
timeout: Duration,
) -> Result<(), String> {
let mut external_event_expectations_met = vec![false; expectations.len()];

while let Ok(Some(ext_event_received_output)) =
async_timeout(timeout, output_stream.next()).await
{
tracing::debug!("Test received Ext Event: {:?}", ext_event_received_output);
for (index, expectation) in expectations.iter().enumerate() {
if !external_event_expectations_met[index] {
for predicate in &expectation.output_asserts {
if predicate
.evaluate(&Arc::new(ext_event_received_output.clone()))
.await
== PredicateResult::Pass
{
external_event_expectations_met[index] = true;
break;
}
}
}
}
if external_event_expectations_met.iter().all(|&x| x) {
return Ok(());
}
}

if external_event_expectations_met.iter().all(|&x| x) {
Ok(())
} else {
Err("Not all external event expectations were met".to_string())
}
}
50 changes: 49 additions & 1 deletion crates/testing/src/predicates/event.rs
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,14 @@ use async_trait::async_trait;
use hotshot_task_impls::events::{HotShotEvent, HotShotEvent::*};
use hotshot_types::{
data::null_block,
event::Event,
traits::{block_contents::BlockHeader, node_implementation::NodeType},
};

use crate::predicates::{Predicate, PredicateResult};
use crate::{
predicates::{Predicate, PredicateResult},
script::ExternalEventsExpectations,
};

type EventCallback<TYPES> = Arc<dyn Fn(Arc<HotShotEvent<TYPES>>) -> bool + Send + Sync>;

Expand Down Expand Up @@ -294,3 +298,47 @@ where
});
Box::new(EventPredicate { check, info })
}

// New predicate type for EventType
type ExternalEventCheckFn<TYPES> = dyn Fn(&Event<TYPES>) -> bool + Send + Sync;

pub struct ExternalEventPredicate<TYPES: NodeType> {
check: Arc<ExternalEventCheckFn<TYPES>>,
info: String,
}

impl<TYPES: NodeType> std::fmt::Debug for ExternalEventPredicate<TYPES> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", self.info)
}
}

#[async_trait]
impl<TYPES: NodeType + Send + Sync + 'static> Predicate<Arc<Event<TYPES>>>
for ExternalEventPredicate<TYPES>
{
async fn evaluate(&self, input: &Arc<Event<TYPES>>) -> PredicateResult {
PredicateResult::from((self.check)(input))
}

async fn info(&self) -> String {
self.info.clone()
}
}

pub fn ext_event_exact<TYPES: NodeType>(
event: Event<TYPES>,
) -> Box<dyn Predicate<Arc<Event<TYPES>>>> {
let info = format!("{:?}", event);
let check = Arc::new(move |e: &Event<TYPES>| {
e.event == event.event && e.view_number == event.view_number
});

Box::new(ExternalEventPredicate { check, info })
}

pub fn expect_external_events<TYPES: NodeType>(
predicates: Vec<Box<dyn Predicate<Arc<Event<TYPES>>>>>,
) -> ExternalEventsExpectations<TYPES> {
ExternalEventsExpectations::from_outputs(predicates)
}
12 changes: 11 additions & 1 deletion crates/testing/src/script.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
use std::{sync::Arc, time::Duration};

use hotshot_task_impls::events::HotShotEvent;
use hotshot_types::traits::node_implementation::NodeType;
use hotshot_types::{event::Event, traits::node_implementation::NodeType};

use crate::predicates::{Predicate, PredicateResult};

Expand Down Expand Up @@ -66,6 +66,16 @@ impl<TYPES: NodeType, S> Expectations<TYPES, S> {
}
}

pub struct ExternalEventsExpectations<TYPES: NodeType> {
pub output_asserts: Vec<Box<dyn Predicate<Arc<Event<TYPES>>>>>,
}

impl<TYPES: NodeType> ExternalEventsExpectations<TYPES> {
pub fn from_outputs(output_asserts: Vec<Box<dyn Predicate<Arc<Event<TYPES>>>>>) -> Self {
Self { output_asserts }
}
}

pub fn panic_extra_output_in_script<S>(stage_number: usize, script_name: String, output: &S)
where
S: std::fmt::Debug,
Expand Down
3 changes: 3 additions & 0 deletions crates/testing/tests/tests_cx_hardening.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
mod tests_cx_hardening {
automod::dir!("tests/tests_cx_hardening");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tests_cx_hardening would make a task namespace into a first-class feature of the codebase, muddying the circumstances for when we should change them. Instead, let's just add these under the existing da task tests and remove this special file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to just throw this into one of the test groups

}
Loading
Loading