Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DFG task granularity #57

Merged
merged 4 commits into from
Oct 18, 2024
Merged

DFG task granularity #57

merged 4 commits into from
Oct 18, 2024

Conversation

antoniupop
Copy link
Contributor

No description provided.

@antoniupop antoniupop force-pushed the antoniu/graph-partition branch 5 times, most recently from 92f603d to 410565d Compare October 10, 2024 11:05
@antoniupop antoniupop marked this pull request as ready for review October 10, 2024 11:05
Copy link
Contributor

@david-zk david-zk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

MaxLocality,
}

impl std::fmt::Debug for ExecNode {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this you can just add #[derive(Debug)] on ExecNode and you'd also add these annotations to dependent structures like NodeIndex

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the debug formatting would not be very helpful if I did that - not sure if maybe I misunderstand your comment.

fhevm-engine/executor/src/dfg/scheduler.rs Outdated Show resolved Hide resolved
fhevm-engine/executor/src/dfg/scheduler.rs Outdated Show resolved Hide resolved
fhevm-engine/executor/src/server.rs Outdated Show resolved Hide resolved
.build()
.unwrap();
rayon_pool.broadcast(|_| {
set_server_key(keys.server_key.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rayon pool could be initialized before tokio runtime to do rayon pool key initialization once? Usually runtimes have lightweight handles to them so they could be cloned if needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rayon thread pool doesn't seem to support cloning. This is a localised rayon thread pool to avoid interference from operations in other tokio blocking threads. This is very lightweight too since it's in an Arc.

rayon_pool.broadcast(|_| {
set_server_key(keys.server_key.clone());
});
THREAD_POOL.set(Some(rayon_pool));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably use here cloned version of the runtime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - we can't clone. If we don't use tokio threads, the global rayon thread pool is used otherwise.

… execution capability

The task scheduling is controlled by the FHEVM_DF_SCHEDULE environment
variable with available policies:
 - MAX_PARALLELISM: preserves parallelism but aggregates dependent tasks to take advantage
   of spatial and temporal cache locality.
 - MAX_LOCALITY: aggregates all connected components in the DFG to improve cache locality.
 - FINE_GRAIN: disable task aggregation and executes each FHE operation based on DFG dependences.
@antoniupop antoniupop force-pushed the antoniu/graph-partition branch from 46703d9 to 50f949e Compare October 17, 2024 12:35
@antoniupop antoniupop requested a review from david-zk October 17, 2024 13:10
This will restrict the rayon thread pool exposed to TFHE-rs in order
to improve the scheduling of concurrent FHE operations. Each re-usable
thread pool is used exclusively for one scheduler task, which improves
cache locality and reduces interference from other operations.
@antoniupop antoniupop force-pushed the antoniu/graph-partition branch from 50f949e to 695f79a Compare October 18, 2024 07:28
@antoniupop antoniupop merged commit 3f6f747 into main Oct 18, 2024
2 checks passed
@antoniupop antoniupop deleted the antoniu/graph-partition branch October 18, 2024 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants