Simplify README.md and migrate to online/Sphinx docs.

cr1901 · Nov 26, 2024 · 291e268
1 parent 1241826
commit 291e268
Showing 10 changed files with 569 additions and 326 deletions.
diff --git a/README.md b/README.md
diff --git a/doc/changes.md b/doc/changes.md
@@ -0,0 +1,2 @@
+```{include} ../CHANGELOG.md
+```
diff --git a/doc/development/internals.md b/doc/development/internals.md
@@ -107,3 +107,146 @@ using the components from the {py:mod}`~sentinel.align` module.
 
 Aside from aligning, the glue logic for latching addresses, read data, and
 write data is minimal and controlled directly by microcode signals. -->
+
+## Instruction Cycle Counts
+
+```{todo}
+I need to create a test that gets latency and throughput for each instruction
+type of the core.
+```
+
+The following counts are general observations (as of 11/18/2023), from
+examining the microcode (knowing that each microcode instruction always takes
+1 clock cycle):
+
+* _There is room for improvement, even without making the core bigger._
+* Fetch/Decode takes a _minimum_ of two cycles thanks to Wishbone classic's
+  REQ/ACK handshake taking two cycles.
+  * When Wishbone ACK is asserted, Decode is taking place.
+  * The GP file is a synchronous single read port, single write port. Sentinel
+    loads RS1 out of the register file during Decode.
+* All instructions share the same operation the cycle after ACK/Decode:
+  * Check for exceptions/interrupts, go to exception handler if so.
+  * Latch RS1 into the ALU.
+  * Load RS2 out of the register file, in anticipation of a "simple"
+    instruction.
+  * Jump to the instruction-specific microcode block.
+* At minimum, an instruction (`addi`, `or`, etc) takes 3 cycles to retire
+  after the initial shared cycles. This means Sentinel instructions have a
+  minimum latency of 6 cycles per instruction (CPI).
+* Sentinel instructions have a maximum throughput of 4 CPI by overlapping the
+  2 Fetch/Decode cycles of the _next_ instruction after the initial 3 shared
+  cycles of the _current_ instruction when possible ("pipelining").
+  * Some instructions overlap one of the Fetch/Decode cycles, some don't
+    overlap either of them. In particular, shift instructions with a nonzero
+    shift count don't pipeline Fetch/Decode. It may be possible to _always_ 
+    overlap at least one cycle, but I haven't tweaked the core yet to ensure
+    this is a sound optimization.
+* _Shift instructions need work_:
+  * For a shift of zero, shift-immediate latency is 10 CPI, throughput 9 CPI.
+    Shift-register latency is 11 CPI, throughput 10 CPI.
+  * For a shift of nonzero `n`, shift-immediate _and_ shift-register latency
+    and throughput is 7 + 2*`n` CPI.
+* Branch-not-taken latency and throughput is 7 CPI. Branch-taken latency and
+  throughput is 8 CPI.
+* JAL/JALR latency is 9 CPI, throughput is 7 CPI.
+* Store latency and throughput is 8 CPI minimum. 2 cycles minimum are spent
+  waiting for Wishbone ACK.
+  * The core will not release STB/CYC between the store and fetch of the next
+    instruction.
+* Load latency is 10 CPI minimum, and throughput is 9 CPI. 2 cycles minimum
+  are spent waiting for Wishbone ACK.
+  * The core _will_ release STB/CYC before fetch of the next instruction.
+* CSR instructions require an extra Decode cycle compared to all other
+  instructions (to check for legality).
+  * At minimum, a read of a read-only zero CSR register has a latency of 7 CPI,
+    and a throughput of 6 CPI.
+  * At maximum, `csrrc` has a latency of 11 CPI, and a throughput of 10 CPI.
+* Entering an exception handler requires 5 clocks from the cycle at which
+  the exception condition is detected.
+  * `mret` has a latency and throughput of 8 CPI. 
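
The shift timings listed above can be restated as a small Python sketch. The function name `shift_cycles` and its `register` flag are hypothetical, and the 0 to 31 range check assumes RV32 shift amounts; only the cycle numbers come from the observations above, so treat this as a back-of-the-envelope aid rather than a measurement.

```python
def shift_cycles(n: int, register: bool = False) -> tuple[int, int]:
    """Return (latency, throughput) in cycles for a shift by `n`.

    `register` selects the shift-register form instead of shift-immediate.
    Hypothetical helper: the numbers restate the observations in the list
    above, they are not measured.
    """
    # Assumption: shift amounts are 0..31, as on RV32.
    if not 0 <= n <= 31:
        raise ValueError("shift amount must be in 0..31")
    if n == 0:
        # A zero shift count is the only case where the two forms differ.
        return (11, 10) if register else (10, 9)
    # Nonzero shifts do not pipeline Fetch/Decode, so latency == throughput.
    cycles = 7 + 2 * n
    return (cycles, cycles)


if __name__ == "__main__":
    for n in (0, 1, 8, 31):
        print(n, shift_cycles(n), shift_cycles(n, register=True))
```

By these numbers, a shift by 31 costs 69 cycles, which is why the shift instructions are flagged as needing work.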
+
+## CSRs
+
+Sentinel physically implements the following CSRs:
+
+* `mscratch`
+* `mcause`
+  * The core can only physically trigger a subset of defined exceptions:
+    * Machine external interrupt
+    * Instruction access misaligned
+    * Illegal instruction
+    * Breakpoint
+    * Load address misaligned
+    * Store address misaligned
+    * Environment call from M-mode
+
+    In particular worth noting:
+    * _Misaligned accesses are not implemented in hardware._
+    * There is no machine timer (a 64-bit counter is a bit too much to
+      ask for right now :(...).
+* `mip`
+  * Only the `MEIP` bit is implemented. The RISC-V Privileged Spec says:
+
+    > `MEIP` is read-only in `mip`, and is set and cleared by a
+    > platform-specific interrupt controller.
+
+    The user must provide their own interrupt controller. See
+    {class}`sentinel.top.Top`. One simple implementation is to `OR` all
+    external interrupt sources together, and query each peripheral when `MEIP`
+    is pending to find which peripherals need attention. This is implemented
+    for the serial and timer peripherals in the `examples/attosoc.py` example.
+
+    ```{note}
+    In the future, I may implement the high (platform-specific) 16 bits of
+    `mip`/`mie` to make interrupt-handling quicker.
+    ```
+* `mie`
+  * Only the `MEIE` bit is implemented.
+* `mstatus`
+  * Only the `MPP`, `MPIE`, and `MIE` bits are implemented.
+* `mtvec`
+  * The `BASE` is writeable; only the Direct `MODE` setting is implemented.
+  
+    ```{todo}
+    A read-only `BASE` is allowed, but I believe the Rust [support code](./support-code.md)
+    assumes a writable `BASE`. I don't wish to fork [`riscv-rt`](https://github.com/rust-embedded/riscv/tree/master/riscv-rt)
+    solely for a read-only `BASE`. So I deal with the potential loss of space
+    savings for now.
+
+    Revisit whether read-only `BASE` is feasible in the future.
+    ```
+* `mepc`
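
The simple interrupt-controller scheme described under `mip` above (OR all external sources into `MEIP`, then query each peripheral) can be modelled in plain Python. This is only a behavioral sketch: the `Peripheral` class and both function names are hypothetical and are not taken from `examples/attosoc.py` or {class}`sentinel.top.Top`.

```python
from dataclasses import dataclass


@dataclass
class Peripheral:
    """Hypothetical stand-in for a peripheral with one interrupt line."""
    name: str
    irq: bool = False


def meip(peripherals: list[Peripheral]) -> bool:
    """Model MEIP as the OR of every external interrupt source."""
    return any(p.irq for p in peripherals)


def needs_attention(peripherals: list[Peripheral]) -> list[str]:
    """What a handler would do once MEIP is pending: query each peripheral."""
    return [p.name for p in peripherals if p.irq]


if __name__ == "__main__":
    serial = Peripheral("serial")
    timer = Peripheral("timer", irq=True)
    print(meip([serial, timer]), needs_attention([serial, timer]))
```

The trade-off is that the handler pays for a query per peripheral, which is the cost the note about the platform-specific `mip`/`mie` bits hopes to reduce.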
+
+The following CSRs are implemented as read-only zero and trigger an exception
+on an attempt to write:
+
+* `mvendorid`
+* `marchid`
+* `mimpid`
+* `mhartid`
+* `mconfigptr`
+
+The following CSRs are implemented as read-only zero (no exception on write):
+
+* `misa`
+* `mstatush`
+* `mcountinhibit`
+* `mtval`
+* `mcycle`
+* `minstret`
+* `mhpmcounter3-31`
+* `mhpmevent3-31`
+
+All remaining machine-mode CSRs are unimplemented and trigger an exception on
+_any_ access:
+
+* `medeleg`
+* `mideleg`
+* `mcounteren`
+* `mtinst`
+* `mtval2`
+* `menvcfg`
+* `menvcfgh`
+* `mseccfg`
+* `mseccfgh`
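
As a quick cross-check of the four CSR groups above, the access rules can be written as a lookup. This is a sketch with hypothetical names (`csr_traps` and the set constants); it models only whether an access raises an exception, and `write=True` is assumed to mean an instruction that actually attempts a write.

```python
# Physically implemented CSRs: reads and writes are both accepted.
IMPLEMENTED = {"mscratch", "mcause", "mip", "mie", "mstatus", "mtvec", "mepc"}

# Read-only zero; an attempted write triggers an exception.
ZERO_TRAP_ON_WRITE = {"mvendorid", "marchid", "mimpid", "mhartid", "mconfigptr"}

# Read-only zero; a write does not trigger an exception.
ZERO_NO_TRAP = (
    {"misa", "mstatush", "mcountinhibit", "mtval", "mcycle", "minstret"}
    | {f"mhpmcounter{i}" for i in range(3, 32)}
    | {f"mhpmevent{i}" for i in range(3, 32)}
)


def csr_traps(name: str, write: bool) -> bool:
    """Return True if accessing machine-mode CSR `name` raises an exception."""
    if name in IMPLEMENTED or name in ZERO_NO_TRAP:
        return False
    if name in ZERO_TRAP_ON_WRITE:
        return write
    # Every remaining machine-mode CSR is unimplemented: any access traps.
    return True


if __name__ == "__main__":
    for name in ("mscratch", "mvendorid", "misa", "medeleg"):
        print(name, csr_traps(name, write=False), csr_traps(name, write=True))
```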
diff --git a/doc/development/overview.md b/doc/development/overview.md
@@ -21,7 +21,6 @@ PDM and DoIt complement each other:
 I leverage both PDM and DoIt for increased flexibility to run tests, benchmarks,
 and generate examples.
 
-(pdm-scripts)=
 ## PDM Scripts
 
 Scripts defined in `[tool.pdm.scripts]` are the main
@@ -73,15 +72,18 @@ scripts must pass as part of CI:
 * `gen`, `demo`, `demo-rust`: Check that code/demo generation works.
 * `test-quick`, `rvformal-all`, and `riscof-all`: Run tests.
 
-
+```{todo}
+Although [ReadTheDocs](https://sentinel-cpu.readthedocs.io/en/latest/) handles
+building docs at present, CI release should be gated on building docs
+successfully.
+```
 
 If necessary, the above PDM scripts invoke `doit`, which reads the `dodo.py`
 file to find out how to do the actual work.[^1]
 
-(doit-tasks)=
 ## DoIt Tasks
 
-`doit` tasks are "low-level" tasks wrapped by the PDM scripts {ref}`above <pdm-scripts>`.
+`doit` tasks are "low-level" tasks wrapped by the PDM scripts [above](#pdm-scripts).
 _They should be treated as private and subject to change._ However, I provide
 a `doit` PDM script to call `doit` directly if necessary. For instance, to
 list available `doit` tasks (_including [private tasks](https://pydoit.org/tasks.html#private-hidden-tasks)_),
@@ -116,6 +118,19 @@ run_sby                  run symbiyosys flow on Sentinel, "doit list --all run_s
 ucode                    assemble microcode and copy non-bin artifacts to root
 ```
 
+I've documented the `doit` tasks (but not subtasks) as a courtesy, and to make sure
+developers/users don't get stuck. That said, **prefer running `pdm` as a
+wrapper to `doit` rather than running `doit` directly.** 
+
+```{note}
+I make heavy use of [DoIt subtasks](https://pydoit.org/tasks.html#sub-tasks).
+These are understandably excluded from default help output. Run
+`pdm doit list --all [task]` to see all sub-tasks for a higher-level task.
+
+For instance, the `run_sby:reg_ch0` subtask runs the [Register Check](https://github.com/YosysHQ/riscv-formal/blob/main/docs/procedure.md#register-checks)
+for `riscv-formal`.
+```
+
 ```{todo}
 Decide whether to commit to treat all `doit` tasks as private/hidden or expose a set
 to a user/developer (no leading underscore). Right now, I am not being consistent.

diff --git a/doc/development/support-code.md b/doc/development/support-code.md
@@ -0,0 +1,33 @@
+# Support Code
+
+`sentinel-rt` is a _currently-empty_ Rust support crate. If necessary, it will
+contain support routines optimized for the Sentinel RISC-V _implementation_
+that LLVM or Rust wouldn't generally know about. I have three potential use
+cases:
+
+1. Wrappers over custom opcodes[^1] and the slow shift operators :),
+2. Related to 1., [`compiler-builtins`](https://github.com/rust-lang/compiler-builtins)
+   specialization if possible[^2].
+3. Runtime/Machine Mode code that is incompatible with the existing
+   [`riscv-rt`](https://github.com/rust-embedded/riscv/tree/master/riscv-rt),
+   _but_ compatible with the RISC-V spec[^3].
+
+However, at present, I don't need any special support code, so `sentinel-rt`
+is just a reserved crate with example code for demo bitstreams.
+
+```{note}
+If I expand demos such that multiple linker scripts are required, the examples
+_directory_ will become an examples crate. Do not depend on
+`sentinel-rt/examples` being stable; the source root is _already_
+a [workspace](https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html)!
+```
+
+## Footnotes
+
+[^1]: A `memcpy` instruction is a good candidate for a custom microcoded
+   instruction that would yield speedups.
+[^2]: Not clear to me that this _is_ possible! Just something I thought of.
+[^3]: The big one here is that RISC-V permits hardcoding `MTVEC`, but last I
+      checked, this was not supported. This would likely be a size win, but
+      I don't want to create a fork of `riscv-rt` just for this one edge case,
+      so I deal.
diff --git a/doc/development/testing.md b/doc/development/testing.md
@@ -1,2 +1,54 @@
 # Testing Sentinel
 
+```{todo}
+This section needs to be fleshed out; right now it is an import of the old
+`README.md`.
+```
+
+## Run Tests
+
+```
+pdm test
+```
+
+or
+
+```
+pdm test-quick
+```
+
+The above will invoke `pytest` and test Sentinel against handcrafted examples,
+as well as the `riscv-tests` repo binaries. See the `README.md`
+in `tests/upstream` for information on how to refresh the binaries.
+
+Right now (11/5/2023), the difference between `test` and `test-quick` is
+minimal.
+
+## Run RISC-V Formal Flow
+
+```
+pdm rvformal-all [-n num_cores]
+```
+
+or
+
+```
+pdm rvformal test-name
+```
+
+See `README.md` in `tests/formal` for more information, including
+valid/available test names.
+
+## Run RISCOF Flow
+
+```
+pdm riscof-all
+```
+
+or
+
+```
+pdm riscof-override /path/to/test_list.yaml
+```
+
+See `README.md` in `tests/riscof` for more information.
diff --git a/doc/index.md b/doc/index.md
@@ -16,8 +16,10 @@ Public API <usage/reference>
 development/overview
 development/internals
 development/microcode
+development/support-code
 development/testing
 development/guidelines
+CHANGELOG <changes>
 TODO List <todo>
 ```