Suggestion: Refactor the EAE (hardware multiplier and divider) #39
Comments
We might also want to include additional functions, e.g. square root. This can be done in around 7 clock cycles too. See this link for an implementation of the square root function: https://github.com/MJoergen/math/tree/master/fast_sqrt
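To illustrate why a square root fits this style of hardware, here is a minimal Python sketch of the classic bit-by-bit integer square root, which needs only shifts, compares, and subtractions. It is not taken from the linked fast_sqrt repository; the function name `isqrt_bitwise` and the 16-bit default width are assumptions made for this example.

```python
def isqrt_bitwise(n: int, width: int = 16) -> int:
    """Floor of the square root of an unsigned `width`-bit value.

    Hypothetical helper for illustration only. `width` must be even.
    Each loop iteration decides one bit of the result, using only
    shift, compare, and subtract operations.
    """
    if width % 2 != 0:
        raise ValueError("width must be even")
    if not 0 <= n < (1 << width):
        raise ValueError("n does not fit in the given width")

    result = 0
    bit = 1 << (width - 2)  # highest power of four representable in `width` bits
    while bit:
        if n >= result + bit:
            n -= result + bit
            result = (result >> 1) + bit
        else:
            result >>= 1
        bit >>= 2  # move on to the next lower result bit
    return result
```

For a 16-bit input the loop runs 8 times, one iteration per result bit. How those iterations map onto clock cycles depends on the hardware design and is not modeled here.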
Here is an implementation of the division algorithm: https://github.com/MJoergen/nexys4ddr/blob/master/offloader/Episodes/ep10_-_Fact/math/divmod.vhd Note that all these optimized algorithms are expected to run at a much faster clock rate (> 200 MHz). This is possible because the bit-shifting logic is very simple. As long as the fast clock is an integer multiple of the CPU clock, there should be no Clock Domain Crossing issues.
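As a rough illustration of how simple the bit-shifting logic can be, the following Python sketch models a textbook restoring (shift-and-subtract) division that produces one quotient bit per iteration. It is not a transcription of the linked divmod.vhd; the name `divmod_restoring`, the 16-bit default width, and raising an exception on division by zero are assumptions for this example only.

```python
def divmod_restoring(dividend: int, divisor: int, width: int = 16) -> tuple[int, int]:
    """Unsigned restoring division, one quotient bit per loop iteration.

    Hypothetical helper for illustration only. Returns (quotient, remainder).
    """
    if divisor == 0:
        # Assumption of this sketch: the real hardware may behave differently.
        raise ZeroDivisionError("division by zero")
    if not 0 <= dividend < (1 << width) or not 0 < divisor < (1 << width):
        raise ValueError("operands do not fit in the given width")

    quotient = 0
    remainder = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit (MSB first) into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Subtract the divisor only if it fits; this sets the quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

Each iteration consists of a shift, a compare, and a conditional subtract, which is the kind of short logic path that tends to tolerate a high clock rate.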
@MJoergen Sounds like a good idea. We could do this in the "dev-V1.61" branch of QNICE, which is the "MiSTer2MEGA65" branch, when the time is ripe. Instead of implementing a CPU_WAIT_FOR_DATA signal, we might also consider using the https://github.com/sy2002/QNICE-FPGA/blob/dev-V1.61/vhdl/EAE.vhd#L108C27-L108C27 The math library already has code that waits for the calculation to finish, as shown here, for example: https://github.com/sy2002/QNICE-FPGA/blob/master/monitor/math_library.asm#L19 We would then need to ensure that our MiSTer2MEGA65 version of https://github.com/sy2002/QNICE-FPGA/blob/master/monitor/monitor.asm#L4 So by going down this route, the refactoring would be rather seamless for the QNICE Monitor code.
Interesting alternative. I had indeed forgotten about that "busy" status :-) However, I still prefer the My reasoning is as follows: The "check for busy" code in Plus, the stalling will only ever occur during the read from the EAE. Notice line 24: https://github.com/sy2002/QNICE-FPGA/blob/master/monitor/math_library.asm#L24C1-L24C1 This is a "useful" instruction taking place simultaneously with the calculation. Therefore, the subsequent read from the EAE in the next line will stall for only about 4 clock cycles.
The refactoring will ultimately reduce the complex logic tree generated.
The idea is to use simpler (and slightly slower) algorithms, but reduce complexity (and maybe even synthesis time).
Currently, the EAE uses approximately 1339 LUTs and 410 slices. This can probably be reduced by a factor of 5 or more.
Currently, a calculation takes 3 clock cycles. After refactoring I estimate a calculation to take around 7 clock cycles.
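One example of the simpler but slower style of algorithm mentioned above is shift-and-add multiplication, sketched here in Python. This is only an illustration under assumptions: the issue does not say which multiplication algorithm the refactored EAE would use, and the name `mul_shift_add` and the 16-bit operand width are hypothetical.

```python
def mul_shift_add(a: int, b: int, width: int = 16) -> int:
    """Unsigned shift-and-add multiplication.

    Hypothetical helper for illustration only. Examines one bit of `b`
    per iteration and adds the correspondingly shifted `a` when that bit
    is set. The result can be up to 2 * width bits wide.
    """
    if not 0 <= a < (1 << width) or not 0 <= b < (1 << width):
        raise ValueError("operands do not fit in the given width")

    product = 0
    for i in range(width):
        if (b >> i) & 1:
            product += a << i  # add the partial product for bit i
    return product
```

A sequential design like this reuses a single adder across iterations instead of building one large combinational multiplier tree, which is the usual way to trade clock cycles for a smaller logic footprint.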
To accommodate the slower calculation, a CPU_WAIT_FOR_DATA signal should be added.