How to compute the "LocalRoutingWireLoad" #54

Open
narutozxp opened this issue Sep 18, 2023 · 3 comments

Comments

@narutozxp

Hello Professor Betz. According to the code in

COFFE/coffe/fpga.py

Lines 2777 to 2783 in 4397424

def _compute_load(self, specs, local_mux):
    """ Compute the load on a local routing wire (number of on/partial/off) """
    # The first thing we are going to compute is how many local mux inputs are connected to a local routing wire
    # This is a function of local_mux size, N, K, I and Ofb
    num_local_routing_wires = specs.I+specs.N*specs.num_ble_local_outputs
    self.mux_inputs_per_wire = local_mux.implemented_size*specs.N*specs.K/num_local_routing_wires

mux_inputs_per_wire = local_mux.implemented_size * specs.N * specs.K / num_local_routing_wires.

However, since the local routing contains specs.N*specs.K muxes, I would expect mux_inputs_per_wire to equal specs.N*specs.K. Could you tell me why it depends on the local mux size? This has puzzled me many times.
Thanks!

@vaughnbetz
Owner

This looks correct to me. I believe the calculation is saying that the total number of mux inputs equals the number of inputs to each mux times the number of such muxes. It then takes all those mux inputs and divides them evenly over the local routing wires to figure out how many mux inputs (loads) we have per wire.

Total number of muxes that are fed by local routing: specs.N * specs.K (i.e. this is the total number of LUT inputs)
Total number of inputs to these muxes: multiply by local_mux.implemented_size
Spread these mux inputs evenly over all the local routing wires (which are the possible sources/inputs): divide by num_local_routing_wires.
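The three steps above can be sketched as a small standalone function. This is only an illustration of the arithmetic, not COFFE's own code: the function name, its argument names, and the example values are hypothetical, and `ofb` stands in for `specs.num_ble_local_outputs`.

```python
def mux_inputs_per_wire(n, k, i, ofb, implemented_size):
    """Average number of local mux inputs (loads) on each local routing wire."""
    # Step 1: one local mux feeds each LUT input, so there are N * K muxes.
    num_muxes = n * k
    # Step 2: every mux has implemented_size inputs.
    total_mux_inputs = num_muxes * implemented_size
    # Step 3: the sources are the I cluster inputs plus the BLE feedback outputs.
    num_local_routing_wires = i + n * ofb
    return total_mux_inputs / num_local_routing_wires


# Illustrative values only (not taken from any real architecture file).
example_load = mux_inputs_per_wire(n=10, k=6, i=40, ofb=1, implemented_size=25)
print(example_load)
```

Because the total is spread evenly, the result is an average: it grows with the implemented mux size even when the number of muxes and wires stays fixed.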

Adding @sadegh68 and @StephenMoreOSU in case they have anything to add.

@narutozxp
Author

@vaughnbetz thank you for your kind answer. However, this results in a heavier load on the input wires without improving routability.
For example, if we have the following architecture:

fpga_arch_params:

  arch_out_folder: ./output_files
  # The following parameters are the classic VPR architecture parameters
  N : 8
  K : 4
  W : 100
  L : 4
  I : 24
  Fs : 3
  Fcin : 0.15
  Fcout : 0.1

  # Number of BLE outputs to general routing 
  Or : 1

  # Number of BLE outputs to local routing
  Ofb : 1

  # Population of local routing MUXes 
  Fclocal : 1

According to the code of COFFE,
level2_size=int(math.sqrt(self.required_size))=int(sqrt(24+8))=5
level1_size=int(math.ceil(float(self.required_size)/self.level2_size))=7
local_mux.implemented_size = level2_size * level1_size = 35
self.mux_inputs_per_wire = local_mux.implemented_size * specs.N * specs.K / num_local_routing_wires = 35 * 8 * 4 / 32 = 35

However, a 32-input mux is enough to implement my full crossbar, yet a 35-input mux is used. In my opinion, a 32-input mux can already select any of the local routing wires, so a 35-input mux does not improve local routing flexibility; its extra inputs only connect again to wires the mux can already reach. On the contrary, the 35-input mux makes the wire load larger, increasing area and delay. Even if I had to use a 35-input mux (perhaps for ease of physical implementation), I would connect only 32 of its 35 inputs, because that gives a smaller delay (smaller wire load) with no effect on the flexibility of local routing.
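The numbers in this comment can be reproduced with a short script. It is a sketch under stated assumptions: the two sizing expressions are the ones quoted above, `required_size` is taken as 24 + 8 as in the comment, and `load_if_only_required_inputs_used` is a hypothetical alternative that COFFE does not compute.

```python
import math


def two_level_mux_size(required_size):
    """Return (level1_size, level2_size, implemented_size) for a 2-level mux."""
    level2_size = int(math.sqrt(required_size))
    level1_size = int(math.ceil(float(required_size) / level2_size))
    return level1_size, level2_size, level1_size * level2_size


# Architecture parameters from the example above.
N, K, I, Ofb = 8, 4, 24, 1

num_local_routing_wires = I + N * Ofb    # 24 + 8 = 32
required_size = num_local_routing_wires  # full crossbar (Fclocal = 1)

level1_size, level2_size, implemented_size = two_level_mux_size(required_size)

# Load as currently computed: based on the implemented (rounded-up) mux size.
load_with_implemented_size = implemented_size * N * K / num_local_routing_wires

# Hypothetical alternative: count only the mux inputs that are actually needed.
load_if_only_required_inputs_used = required_size * N * K / num_local_routing_wires

print(level1_size, level2_size, implemented_size)
print(load_with_implemented_size, load_if_only_required_inputs_used)
```

For this architecture the two loads differ by three mux inputs per wire (35 versus 32), which is the gap under discussion.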

@vaughnbetz
Owner

Your logic seems correct. It will be a small difference, but you are right that rounding off the number of inputs to force a balanced 2-level mux will increase the number of loads a little. I don't think it will make a large difference, and it does keep the mux symmetric. You could update the code to change the computation for this case, but if you do so it should be commented so it's understandable. Adding @StephenMoreOSU in case he has any thoughts.
