-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
About running COFFE with local optimization instead of global #40
Comments
You're right. I think in both cases the area that was calculated should be returned (it is put in a local variable area, but not returned). If you make a pull request with that change and test it gives reasonable results we can merge it in. Thanks for flagging this. |
Hi, After carefully looking at the whole tran_sizing.py code. I found additional issues regarding the cost function for area-delay trade-offs:
This won't be a problem for the "global" mode. If you look at the get_eval_area() function invoked within the cost_function() and follow the get_eval_area() definition: Line 237 in 09fb0d4
When the opt_type parameter is hard-coded as "global", and the is_ram_component and is_cc_component parameters are 0 as in the case for most logic tile components, the get_eval_area() function will return fpga_inst.area_dict["tile"] which is okay. But I think a better practice is that in the sizing code for each component as shown in the above three examples, the fpga_inst.sb_mux parameter should be replaced by the name of each component being sized. For example, in this example: Line 2363 in 09fb0d4
The code is trying to size the connection block mux, then it's better to put fpga_inst.cb_mux instead of fpga_inst.sb_mux. This allows the user to run COFFE in local mode without using the wrong area (which is always the area of fpga_inst.sb_mux in the current COFFE).
If you look at the get_current_delay() function: Line 406 in 09fb0d4
The second parameter is_ram_component specifies whether this component is a RAM component, this parameter should be 0 when sizing the above three examples, but COFFE assigns the cost_function twice with the second assignment using is_ram_component = 1. According to the get_current_delay() function, if is_ram_component = 1, the function will return the delay of RAM which gives a wrong cost. Lines 483 to 486 in 09fb0d4
After modifying the code with my understanding based on the above two points, I reran the test_top_level.py file to test whether I did something wrong. The tests passed for most components since I am running in the global mode. But for some carry chains and lut drivers, the tests failed with 5% ~ 15% error in delay. But I think this is because that the current COFFE uses a wrong current cost for these components as described in my second point (assigning current_cost twice). If you want to look at my modifications, you can refer to my branch here (only changed the tran_sizing.py file): https://github.com/yc2367/COFFE/tree/Yuzong_Test Please let me know if you prefer me to create a pull request for comparison. Thanks! |
Thanks for the detailed analysis! @sadegh68 @aman26kbm : any thoughts? These look like good things to fix, but getting your opinion on the code in question and what tests should be run would be great. |
Good observations, @yc2367. Thanks for the details. I agree that these are all bugs we should fix. Most likely they are copy-paste errors, and because we usually don't run the local mode. I went through the changes in your branch. I agree with all changes. There are a couple things I want to mention:
I do see you've added a new test file as well, which runs the existing 4 tests with the |
I agree with the changes. These bugs exist due to the fact that the local mode was not tested during parts of COFFE's development. A few tests focusing on the local mode should reduce the likelihood of such future bugs in the future. |
Hi @vaughnbetz, @aman26kbm and @sadegh68 , Thanks a lot for your reply.
Another question, I actually reran the original COFFE with tests_top_level.py. It also gave some delay errors for components like lut_driver, carry_chain etc. Could you help me clarify this point? I will try to add a reference file for the local mode soon. I think the benefit of local mode is that if some users want to develop additional components like computing in-memory blocks, e.g. an adder. They may want to reduce the area overhead by running the local mode just for that computing in-memory block, and this will be much faster since the rest of the sizing code can even be commented out when running the local mode for just one block. |
Aren't these failures expected? We are changing the code that affects the sizing of these components (lut driver and carry chain), even in the global mode. No? |
Thanks for the reply! I mean the original COFFE without my changes, which is the current COFFFE on this repository, also failed some tests using the current test reference file. |
Ah I see. Could be because of the HSPICE version? The original reference files may have been generated with an older version of HSPICE. |
Might be, I am using the 2022 version Hspice. But just want to clarify if you will have get some 5%~15% errors using yiur Hspice? |
I'm not sure of the reason. The changes are pretty small, thankfully. Adding @StephenMoreOSU in case he has any ideas. |
Hi, |
Sorry I won't have the time to help with the verification. |
Hi,
I am currently using coffe to run automatic sizing for a full adder. I only care about the area and delay trade-off for this subcircuit without considering the global mode. Hence, I specify "-o local" when running coffee. However, it continuously popped up an error saying that the area_list in tran_sizing.py, line 1386 was empty.
I looked into this issue and found that the reason was that the get_eval_area() function in tran_sizing.py, line 237 doesn't return anything when the opt_type is "local":
COFFE/coffe/tran_sizing.py
Lines 237 to 243 in 09fb0d4
And the same problem exists for the get_final_area() function in tran_sizing.py, line 254:
COFFE/coffe/tran_sizing.py
Lines 254 to 257 in 09fb0d4
Could you take a look at this?
The text was updated successfully, but these errors were encountered: