-
Notifications
You must be signed in to change notification settings - Fork 395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Router Lookahead Profiler #2683
base: master
Are you sure you want to change the base?
[WIP] Router Lookahead Profiler #2683
Conversation
d1025c9
to
c86ff7c
Compare
25cbc7f
to
0826ad7
Compare
@soheilshahrouz was also interested in reviewing this code. We'll eventually need a QoR run to show no slowdown (good to run both VTR and Titan suites and attach spreadsheets). |
e939ea2
to
cbe1334
Compare
vpr/src/base/read_options.cpp
Outdated
.help( | ||
"For every routed sink, records the cost, delay, and congestion estimated by the router lookahead and the " | ||
"actual cost, delay, and congestion, from every node along each route to the sink. These results, along with many " | ||
"other attributes of the node, are recorded into lookahead_verifier_info.csv.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the user should be able to specify the file name.
vpr/src/route/router_lookahead.cpp
Outdated
@@ -109,7 +115,7 @@ float NoOpLookahead::get_expected_cost(RRNodeId /*current_node*/, RRNodeId /*tar | |||
return 0.; | |||
} | |||
|
|||
std::pair<float, float> NoOpLookahead::get_expected_delay_and_cong(RRNodeId /*node*/, RRNodeId /*target_node*/, const t_conn_cost_params& /*params*/, float /*R_upstream*/) const { | |||
std::pair<float, float> NoOpLookahead::get_expected_delay_and_cong(RRNodeId, RRNodeId, const t_conn_cost_params&, float) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you remove argument name comments?
RRNodeId to_node, | ||
const t_conn_cost_params& params, | ||
float R_upstream) const override; | ||
std::pair<float, float> get_expected_delay_and_cong(RRNodeId from_node, RRNodeId to_node, const t_conn_cost_params& params, float R_upstream) const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
long line
@@ -181,7 +181,7 @@ float ExtendedMapLookahead::get_chan_ipin_delays(RRNodeId to_node) const { | |||
// | |||
// The from_node can be of one of the following types: CHANX, CHANY, SOURCE, OPIN | |||
// The to_node is always a SINK | |||
std::pair<float, float> ExtendedMapLookahead::get_expected_delay_and_cong(RRNodeId from_node, RRNodeId to_node, const t_conn_cost_params& params, float /*R_upstream*/) const { | |||
std::pair<float, float> ExtendedMapLookahead::get_expected_delay_and_cong(RRNodeId from_node, RRNodeId to_node, const t_conn_cost_params& params, float) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was the argument name comment removed?
float expected_delay_cost = params.criticality * expected_delay; | ||
float expected_cong_cost = (1.0 - params.criticality) * expected_congestion; | ||
float expected_delay_cost = expected_delay * params.criticality; | ||
float expected_cong_cost = expected_congestion * (1. - params.criticality); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you need to change the order?
@vaughnbetz @AlexandreSinger Python lint is failing because some of the libraries I used can't be imported; is there any way around this? |
It can be probably fixed by updating requirements.txt file |
d5dc37d
to
3b9615d
Compare
ebdc050
to
1168090
Compare
d7f1068
to
2db5325
Compare
@soheilshahrouz @vaughnbetz Some regtests are having some QoR failures. On However, on |
I'm not too worried about dsip; it is small. The long route times on wb_conmax are also probably OK if there are just a few of these. Adding @amin1377 since those 3D and flat routing ones are probably his tests. The more I think about it the more I think we should #ifdef adding data members to the heap though; I am really not keen on carrying around extra data in a big and hot data structure for profiling. I think it would be better to have no impact on the default flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good; just have some commenting suggestions. We'll also need a QoR run on VTR & Titan to check cpu time & quality.
vpr/src/base/read_options.cpp
Outdated
.help( | ||
"For every routed sink, record the cost, delay, and congestion estimated by the router lookahead and the " | ||
"actual cost, delay, and congestion, from every node along each route to the sink. These results, along with many " | ||
"other attributes of the node, are recorded into the file name provided. File extension must be .csv.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add "to assist in debugging or validating the router lookahead."
@@ -1400,6 +1400,8 @@ struct t_router_opts { | |||
std::string write_intra_cluster_router_lookahead; | |||
std::string read_intra_cluster_router_lookahead; | |||
|
|||
std::string lookahead_profiling_output; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add doxygen comment for this (can use ///).
If it's an empty string does it not get printed? (If so, say that).
*/ | ||
struct t_rr_node_route_inf { | ||
RREdgeId prev_edge; | ||
|
||
float acc_cost; | ||
float path_cost; | ||
float backward_path_cost; | ||
#ifdef PROFILE_LOOKAHEAD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a brief comment saying why this extra data is needed for PROFILE_LOOKAHEAD, and that we conditionally compile it because this is a hot and large data structure.
@@ -441,7 +441,7 @@ static void add_delay_to_matrix( | |||
} | |||
|
|||
static void generic_compute_matrix_dijkstra_expansion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a doxygen comment for this, detailing what you know about the function and its arguments (and for sure what the route_profiler does).
vpr/src/route/connection_router.cpp
Outdated
@@ -239,7 +239,21 @@ t_heap* ConnectionRouter<Heap>::timing_driven_route_connection_from_heap(RRNodeI | |||
// This is then placed into the traceback so that the correct path is returned | |||
// TODO: This can be eliminated by modifying the actual traceback function in route_timing | |||
if (rcv_path_manager.is_enabled()) { | |||
rcv_path_manager.insert_backwards_path_into_traceback(cheapest->path_data, cheapest->cost, cheapest->backward_path_cost, route_ctx); | |||
#ifdef PROFILE_LOOKAHEAD | |||
rcv_path_manager.insert_backwards_path_into_traceback(cheapest->path_data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is PROFILE_LOOKAHEAD tested with rcv? If not, we could just choose to not support PROFILE_LOOKAHEAD with RCV to simplify the code a little.
@@ -53,7 +58,8 @@ class RouterDelayProfiler { | |||
|
|||
vtr::vector<RRNodeId, float> calculate_all_path_delays_from_rr_node(RRNodeId src_rr_node, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you know enough about this routine to write a doxygen comment it would be good to write one.
// pass in an iteration value, which is the only context in which we want to profile. | ||
if (iteration < 1) | ||
return; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function should have a comment giving an overview of how it works, unless that's already covered in the (to be added) lookahead_profiler.cpp file-scope comment that gives the high-level context.
vpr/src/route/lookahead_profiler.h
Outdated
* | ||
* @param iteration The router iteration. | ||
* @param target_net_pin_index Target pin of this sink in the net. | ||
* @param cost_params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest briefly commenting each input's purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need a high-level comment of how the lookahead profiler works somewhere (file level, or maybe on this routine).
vpr/src/route/router_lookahead.h
Outdated
@@ -123,7 +169,7 @@ const RouterLookahead* get_cached_router_lookahead(const t_det_routing_arch& det | |||
class ClassicLookahead : public RouterLookahead { | |||
public: | |||
float get_expected_cost(RRNodeId node, RRNodeId target_node, const t_conn_cost_params& params, float R_upstream) const override; | |||
std::pair<float, float> get_expected_delay_and_cong(RRNodeId node, RRNodeId target_node, const t_conn_cost_params& params, float R_upstream) const override; | |||
std::pair<float, float> get_expected_delay_and_cong_ignore_criticality(RRNodeId node, RRNodeId target_node, const t_conn_cost_params& params, float R_upstream) const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long line; maybe break it.
vpr/src/route/router_lookahead.h
Outdated
/** | ||
* @brief Get expected (delay, congestion) from node to target_node. | ||
* | ||
* @attention Either compute or read methods must be invoked before invoking get_expected_delay_and_cong_ignore_criticality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
must be invoked --> must be invoked to set up the lookahead data structures
Sorry for the delay with the QoR results; for some reason, they have degraded with this change. I'm trying to figure out why; if/when I can, I'll post the results in the description. |
Thanks. If you can post what the current QoR is that would be helpful too; even if it is not that good it helps show what the status is. |
There is a new class
LookaheadProfiler
which records a variety of data that can be used to evaluate the quality of the router lookahead.Description
Using the branch reconstructed in
RouteTree::add_subtree_from_heap()
,LookaheadProfiler
records the following data into a.csv
file for every node in the branch:The profiler is enabled by specifying
CMAKE_PARAMS="DVPR_PROFILE_LOOKAHEAD=on"
when building VPR, and also using the command-line option--profile_router_lookahead [output_file_name.csv]
.This data can be analyzed using the script
vtr_flow/scripts/profiling_utils/parse_lookahead_data.py
, which produces a data summary file and a number of scatter plots, bar graphs, pie charts, and heatmaps. (TBD: Currently, using-h
gives detailed instructions, but a README will be added as well.)Motivation and Context
This was used to verify issues that led to #2639. It can be used to identify current/future issues with the lookahead.
How Has This Been Tested?
There is a new CI test that ensures that VPR builds properly with
PROFILE_LOOKAHEAD
defined.Currently, even with
PROFILE_LOOKAHEAD
undefined, there is a degradation in QoR. These results are averages over 3 runs each:vtr
(old)vtr
(new)titan
(old)titan
(new)titan_large
* (old)titan_large
* (new)For some reason, there is a 1.6% wirelength increase on
titan
(andtitan_large
) circuits and a 2.4% CPD increase ontitan_large
circuits.*10 largest
titan
circuits.Types of changes
Checklist: