Zfh #13

YunhaoLan · 2023-09-21T22:13:08Z

Initial commit of the half-precision FPU. This is a copy of the stage3 branch.

* adjustments to priv control to account for interrupt mie issues
* Added commentary on the PRIV_CONTROL fix
* Fixing prior commit -- added the wrong files. Reverting changes & adding correct files
* Fixed memory controller bug where interrupts caused FSM to lock up. Passes all RVB tests, and works in AFTx06 integration.
* Added halt on infinite loop as microarch parameter, fixed minor memory controller bug
* Fixed priv unit issue where mepc was captured during interrupt handler

Co-authored-by: Christopher Chiminski <[email protected]>

Initial implementation of RV32C. Contains instruction decompression logic and fetch buffer to allow for free mixing of 16- and 32-bit instructions. Also includes self-tests for all currently-implemented RV32C instructions and updates to config_core.py to allow RV32C to be enabled/disabled. Author: Jing Yin See
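As a rough illustration of what the decompression logic does, here is a minimal Python sketch that expands a single compressed instruction, C.ADDI, into its 32-bit ADDI equivalent. The function names are hypothetical and are not taken from the RTL; only the field layout follows the RISC-V C extension encoding.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def expand_c_addi(insn16: int) -> int:
    """Expand a 16-bit C.ADDI into the equivalent 32-bit ADDI rd, rd, imm.

    Hypothetical helper for illustration; not the core's decompressor.
    """
    opcode = insn16 & 0x3
    funct3 = (insn16 >> 13) & 0x7
    if opcode != 0b01 or funct3 != 0b000:
        raise ValueError("not a C.ADDI encoding")
    rd = (insn16 >> 7) & 0x1F                      # rd doubles as rs1
    imm6 = (((insn16 >> 12) & 0x1) << 5) | ((insn16 >> 2) & 0x1F)
    imm12 = sign_extend(imm6, 6) & 0xFFF           # widen to the I-type immediate
    # I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode (0x13 = OP-IMM)
    return (imm12 << 20) | (rd << 15) | (0b000 << 12) | (rd << 7) | 0x13


print(hex(expand_c_addi(0x0505)))  # c.addi a0, 1
```

A full decompressor covers every implemented RV32C instruction the same way: decode the 16-bit fields, then re-emit them in the 32-bit layout the rest of the pipeline already understands.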

Initial support for Verilator in RISCVBusiness. Adds a Makefile supporting Verilator along with small fixes to ensure compilation. Adds an updated "run_tests_verilator" based on @ngildenhuys' improved run_tests script.

Co-authored-by: Mitch <[email protected]>
Co-authored-by: Hadi Ahmed <[email protected]>

Occasionally, when RV32C is able to provide an instruction early (i.e. a buffered compressed instruction), execution could be skipped because some validity flags rely on waiting for I-fetch. This provides a small fix to the hazard unit that detects the "early finish" condition and allows the same updates as if instruction fetch had completed.

For implementing pipelines with >2 stages, it is necessary for rd to be supplied by a later stage instead of the current instruction. This changes the control unit to output its own rd signal instead of feeding the register file directly. For tspp, the control unit's rd is assigned directly to the register file's rd, but for stage3 it will be passed to the next stage, and the next stage's rd will be fed to the register file for performing writeback.

This adds a 3-stage pipeline, defined as follows:

Fetch | Decode/Execute | Memory/Writeback

There is a forwarding path from the second latch into the decode/execute stage. However, memory references cannot be forwarded due to critical path issues, so load->use and csr->use hazards will incur 1 extra cycle of stalling (beyond usual delay of load) for dependent instructions.

Some rough estimates of frequency (standalone) are:
- 70MHz max on FPGA
- 200MHz+ feasible in ASIC

This implies that 50MHz FPGA speed and 100MHz ASIC speed should be attainable, with more room to push the latter depending on the rest of the system.

All the RV32I tests pass. Remaining TODOs are:
- Alter the TB to allow testing of forwarding. Currently, memory references take so long that there are never back-to-back instructions in the pipeline. While this is somewhat realistic (AFT will have these delays), the addition of a cache (and prefetching effect of RV32C) will allow single-cycle instruction hits.
- Implement exceptions/interrupts. This is currently untested.
- Allow RV32C. Theoretically, this shouldn't cause any problems, since the parts of the pipeline that touch the decompressor were not altered.
- RV32M. There needs to be a decision about whether RISC-MGMT should be supported for this pipeline. Ideally it should be added, but this is also a chance to remove the standard extensions from RISC-MGMT, and change it to be only for custom instructions if that is desired.
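The forward-versus-stall rule in this commit message (ALU results are forwarded from the second latch, while load and CSR results force a stall) could be modelled roughly as below. This is a behavioral sketch only: the function name, the string labels, and the argument shapes are assumptions for illustration, not names from the hazard or forwarding units.

```python
from typing import Sequence


def resolve_hazard(producer_kind: str, producer_rd: int,
                   consumer_sources: Sequence[int]) -> str:
    """Decide how a dependent instruction in Decode/Execute is handled.

    producer_kind: "alu", "load", or "csr" (hypothetical labels) for the
    instruction currently in Memory/Writeback.
    Returns "none", "forward", or "stall".
    """
    # x0 is hardwired to zero, so it never creates a real dependency.
    if producer_rd == 0 or producer_rd not in consumer_sources:
        return "none"
    # Memory and CSR results are not on the forwarding path (critical path),
    # so the dependent instruction waits one extra cycle.
    if producer_kind in ("load", "csr"):
        return "stall"
    return "forward"


print(resolve_hazard("load", 5, [5, 6]))
```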

This changes the logic for stalling/flushing so that on receiving an interrupt, the currently-executing memory instruction is allowed to finish (if such an instruction exists), then the oldest PC in the pipeline is taken. Since this is asynchronous, there is no guarantee the M-stage has a valid PC, so priority logic is needed to select from the 3 PCs in the pipeline. This is simpler than latching the next PC of the last valid instruction since it doesn't require instruction-specific knowledge (e.g. control flow target, compressed, etc.). Signals were added to the pipeline to track when an instruction is valid as well, since insn == 0 cannot be assumed to be a pipeline bubble instead of a nop instruction.
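The priority selection described here can be sketched as "the oldest stage holding a valid instruction wins, otherwise fall back to the fetch PC". The sketch below is an assumption-laden model: the function and argument names are hypothetical, and it does not capture how the RTL treats a memory instruction that is allowed to finish before the interrupt is taken.

```python
from typing import Sequence, Tuple


def select_interrupt_pc(later_stages: Sequence[Tuple[bool, int]],
                        fetch_pc: int) -> int:
    """Pick the PC to resume from when an interrupt is taken.

    later_stages: (valid, pc) pairs ordered oldest first, e.g.
    [(valid_m, pc_m), (valid_e, pc_e)]. Names are illustrative only.
    fetch_pc: PC of the fetch stage, used when no later stage is valid.
    """
    for valid, pc in later_stages:
        if valid:          # first hit is the oldest valid instruction
            return pc
    return fetch_pc        # pipeline holds only bubbles past fetch


print(hex(select_interrupt_pc([(False, 0x80000008), (True, 0x8000000C)],
                              0x80000010)))
```

Using explicit valid bits rather than testing for a zero instruction word matches the point made above: an all-zero word cannot be assumed to be a bubble.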

Adds minor fixes for stalling logic to allow RV32C to work. Includes fixes for forwarding logic bugs that surfaced once the RV32C buffer made back-to-back execution possible. All RV32I and RV32C tests pass with RV32C enabled and compiling with compression, with the exception of RV32I fence.i, jal, jalr. These are expected failures, as they all implicitly assume instructions are 4-byte aligned rather than 2- or 4-byte aligned.

This adds a second version of RISCVBusiness for testing the core that does not have a memory controller. This allows direct access to the buses for testing with different latencies. Currently, it just does immediate latencies to mimic having perfect caches. Additional work would be to add the ability to simulate hits/misses by having a random chance of getting bad latency, independently for I/D streams.

This fixes bugs related to hazards occurring only in the case of back-to-back execution of instructions. The fixes were:
- Fixing forwarding unit assignments to allow detection of forwarding conditions correctly
- Forcing ifence to flush the pipeline and re-fetch in-flight instructions that may no longer be valid
- Fix RV32C to obey the pipeline control signals. Previously, it ignored the "pc_en" signal, which led to cases where instructions would be skipped if the first pipeline latch was stalled while RV32C wanted to advance.

Remaining items to test/implement:
- Flushing for selected CSR writes. Need a list of such instructions, but at minimum this should include the PMP/PMA configuration registers.
- Testing variable latencies, instead of only fixed slow/fast latencies

This commit adds RV32M to the stage3 pipeline. This does not use RISC-MGMT, instead opting for a wrapper with enable/disable like RV32C. In discussions with the team, this seems more manageable than trying to fit more complex extensions (that include state) into RISC-MGMT. In the future, RISC-MGMT should be integrated to allow custom instructions. All tests for RV32IMC pass.

Fixes a bug where I-fetch after a PC redirect could read the wrong instruction if the prior in-progress request became ready after the PC changed. Changes are:
- Suppress iren whenever PC is redirected
- Do not sample EPC from mem stage on interrupt (fixes repeated load/store instruction to non-idempotent region, but still permits load/store faults)
- Expand memory controller ability to abort transactions when iren is suppressed

cole-nelson and others added 30 commits October 8, 2020 20:33

Sparce isa integration (#1)

5c9f310

Pulling SPARCE into master as basis for further development.

Changed TBs to use internal halt signal due to removal of

a432ac2

external halt signal. Tests all pass IF:
1. RESET_PC is changed to 0x8000_0000
2. In the sparce SASA Table, the constraint on the skipping (pc[31:18] == '0) is removed to allow PC 0x8000_0000+ to skip.

Interrupt integration (#2)

03979a5

Merging Enes' priv unit updates. Works for all existing ASM tests (made to work with it).

Bus team (#3)

8a8a5bf

Updated AHB Master with support for waited transfers

Mul div fix (#4)

47abfb4

Implementation of a 3-stage pipelined multiplier and a radix-4 restoring divider. Both pass all RV32M tests and are synthesizable.

Co-authored-by: Jing Yin See <[email protected]>
Co-authored-by: Yuqing Fan
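For readers unfamiliar with the divider style named in this commit, here is a minimal behavioral sketch of unsigned radix-4 restoring division: each iteration retires two dividend bits and picks a quotient digit in 0..3. It is a software model under stated assumptions, not the RTL; the function name is hypothetical, hardware would typically implement the digit choice with trial subtractions of d, 2d and 3d, and signed handling is omitted. The divide-by-zero result follows the RV32M convention for DIVU/REMU.

```python
from typing import Tuple


def radix4_restoring_divide(dividend: int, divisor: int,
                            width: int = 32) -> Tuple[int, int]:
    """Unsigned division, two quotient bits per iteration.

    Returns (quotient, remainder). Illustrative model only.
    """
    if width % 2:
        raise ValueError("width must be even for radix-4 iteration")
    mask = (1 << width) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        # RV32M: DIVU by zero yields all ones, REMU yields the dividend.
        return mask, dividend

    remainder = 0
    quotient = 0
    for step in range(width // 2 - 1, -1, -1):
        # Shift in the next two dividend bits, most significant pair first.
        remainder = (remainder << 2) | ((dividend >> (2 * step)) & 0b11)
        # Largest digit whose multiple still fits; smaller trials are
        # "restored" by simply not committing the subtraction.
        digit = 0
        for candidate in (3, 2, 1):
            if remainder >= candidate * divisor:
                digit = candidate
                break
        remainder -= digit * divisor
        quotient = (quotient << 2) | digit
    return quotient, remainder


print(radix4_restoring_divide(100, 7))
```

Because the remainder is always smaller than the divisor at the start of an iteration, the shifted value is below four times the divisor, so a digit of at most 3 is always sufficient.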

Add files via upload

eb9a279

Merge pull request #7 from Purdue-SoCET/risc_mgmt_extension_uvm

9e2405b

risc_mgmt_extension_uvm

Add files via upload

712b300

add RV32E support. (#9)

f4fa7f3

* Initialize RV32E and DEV branch
* update .gitignore
* add RV32E support
* Reverted changes to TESTNUM

Co-authored-by: Jiahao Xu (socet94) <[email protected]>
Co-authored-by: Cole Nelson <[email protected]>

WFI detection for clock manager (#10)

1e9e72a

* WFI detection for clock_manager Co-authored-by: Project47 <Raghuraman Kannan> Co-authored-by: Cole Nelson <[email protected]>

Privileged Unit 1.12 CSR Update (#17)

8e96b87

Implementation of complete M-Mode Priv 1.12 spec. Co-authored-by: Hadi Ahmed Project17 SOC <[email protected]> Co-authored-by: Cole Nelson <[email protected]>

Fusesoc (#20)

1a203c8

* Initial support for FuseSoC in RISCVBusiness Co-authored-by: Mitch <[email protected]> Co-authored-by: Hadi Ahmed <[email protected]>

Priv 1.12 PMA integration (#21)

9f09e77

RISC-V PMA implementation for Privileged Spec 1.12.

v1.12 PMP Integration (#22)

03e4fb5

Implementation of Priv 1.12 PMP

User Mode, v1.12 implementation (#23)

8682516

Implementation of User Mode, Priv 1.12

stage3: Add full CPU tracker support

f0d37cc

This commit adds debug-only signals to the pipeline for full CPU tracker support. Additionally, it allows the TB to dump the waveform on a Ctrl-C (SIGINT) instead of leaving a corrupted waveform trace, for better debugging of infinite loops.

stage3: exceptions functional

f43b408

All existing synchronous exception tests pass (ecall, pma). Additionally, 2 more tests for illegal instructions and pma i-fetch faults were added to test exceptions originating in all 3 stages, which also pass.

Bugfix: adding .core and TB file for the no_memory version of the TB

f141649

stage3: Fixes to allow synthesis under Quartus + Genus

7ec493d

hadiahmed098 and others added 18 commits December 11, 2022 16:00

Added new privileged unit changes to core

cc03b4f

Minor PMA fixes

cdd1e69

APB: Fix back-to-back transactions

22a355a

This adds an extra state to the APB requester module to permit correct handling of back-to-back transactions. The new request state takes the same latched signals as the data state, so spurious input changes cannot break the request.

apb: Fix alignment

7d63c0f

Fixes an issue where misaligned addresses can appear on APB, causing completers to signal an error. Fix forces address alignment by tying lower bits to '0', and relying on strobe for writes.
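The alignment fix can be pictured with a small model: the low address bits are forced to zero and the byte lanes being written are conveyed through the strobe instead. The sketch below assumes a 32-bit data bus (two low address bits, four byte lanes) and uses a hypothetical function name; it is not taken from the bridge RTL.

```python
from typing import Tuple


def align_apb_access(addr: int, size_bytes: int) -> Tuple[int, int]:
    """Return (word-aligned address, 4-bit write strobe) for a write.

    Assumes a 32-bit bus; accesses must not cross a word boundary.
    """
    offset = addr & 0x3
    if size_bytes not in (1, 2, 4) or offset + size_bytes > 4:
        raise ValueError("access does not fit within one 32-bit word")
    aligned_addr = addr & ~0x3                      # tie the low bits to 0
    strobe = ((1 << size_bytes) - 1) << offset      # one bit per byte lane
    return aligned_addr, strobe


addr, strobe = align_apb_access(0x1002, 2)
print(hex(addr), bin(strobe))
```

Reads need no strobe under this scheme: the completer returns the whole word and the requester selects the bytes it needs.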

ram_sim_model: Change to support loading binary data instead of hex

4153d13

This commit fixes up the SystemVerilog self-test testbench, and changes the simulated ram model to use binary files to match the behavior of the Verilator testbench. Only basic testing has been done.

stage3: Fix bugs preventing runs with Xcelium/Modelsim

2be5971

Connect some unconnected signals (masked by Verilator 'x' handling) and fix some parsing differences between Verilator and Xcelium/Modelsim.

stage3: Fix bug where MEPC gets unknown value

435c231

The signal "valid_e" was unassigned, causing an unknown value in simulation with Xcelium

bus_bridges: Fixup AHB bridge for AFTx07

2c8f46c

l1 caches integration (#24)

f9c4eb9

L1 Cache integration

Co-authored-by: Jimmy <[email protected]>
Co-authored-by: Cole Nelson <[email protected]>

memory_controller: Fix high-latency AHB bug

aebf325

This fixes a bug where high-latency operations being cancelled could cause incorrect execution. The strategy is to hold the memory controller in a state where all buses are "busy" until all outstanding bus requests complete.

CPU Tracker: Remove "display", add WFI support

16549ef

Bus fault handler (#26)

af8ee22

* Priv Unit: Fix handling of PMA/PMP faults
* generic_bus_if: Add "error" signal, propagate
* Bus Fault: Fix I-fault case
* L1: Make pass_through respect wen/ren, revert cache state when request deasserted.

stage3: Fix HDL issues in Quartus

2b1a48d

1. Label generate blocks
2. Fix inferred latch in priv unit (typo)

fix: Incorrect config integer comparison form (#27)

641d74d

build: Disable verilator error for ENUMVALUE (#28)

f86bd96

Zfh initial update for addition and multiplication

2f0807b

Zfh initial update for addition and multiplication

a6bdb1f
