Simplify README.md and migrate to online/Sphinx docs.
cr1901 committed Nov 26, 2024
1 parent 1241826 commit 291e268
Showing 10 changed files with 569 additions and 326 deletions.
350 changes: 60 additions & 290 deletions README.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions doc/changes.md
@@ -0,0 +1,2 @@
```{include} ../CHANGELOG.md
```
143 changes: 143 additions & 0 deletions doc/development/internals.md
@@ -107,3 +107,146 @@ using the components from the {py:mod}`~sentinel.align` module.

Aside from aligning, the glue logic for latching addresses, read data, and
write data is minimal and controlled directly by microcode signals. -->

## Instruction Cycle Counts

```{todo}
I need to create a test that gets latency and throughput for each instruction
type of the core.
```

The following counts are general observations (as of 11/18/2023), from
examining the microcode (knowing that each microcode instruction always takes
1 clock cycle):

* _There is room for improvement, even without making the core bigger._
* Fetch/Decode takes a _minimum_ of two cycles thanks to Wishbone classic's
  REQ/ACK handshake taking two cycles.
  * When Wishbone ACK is asserted, Decode is taking place.
* The GP file is a synchronous single read port, single write port. Sentinel
  loads RS1 out of the register file during Decode.
* All instructions share the same operation the cycle after ACK/Decode:
  * Check for exceptions/interrupts, go to exception handler if so.
  * Latch RS1 into the ALU.
  * Load RS2 out of the register file, in anticipation for a "simple"
    instruction.
  * Jump to the instruction-specific microcode block.
* At minimum, an instruction (`addi`, `or`, etc) takes 3 cycles to retire
  after the initial shared cycles. This means Sentinel instructions have a
  minimum latency of 6 cycles per instruction (CPI).
* Sentinel instructions have a maximum throughput of 4 CPI by overlapping the
  2 Fetch/Decode cycles of the _next_ instruction after the initial 3 shared
  cycles of the _current_ instruction when possible ("pipelining").
  * Some instructions overlap one of the Fetch/Decode cycles, some don't
    overlap either of them. In particular, shift instructions with a nonzero
    shift count don't pipeline Fetch/Decode. It may be possible to _always_
    overlap at least one cycle, but I haven't tweaked the core yet to ensure
    this is a sound optimization.
* _Shift instructions need work_:
  * For a shift of zero, shift-immediate latency is 10 CPI, throughput 9 CPI.
    Shift-register latency is 11 CPI, throughput 10 CPI.
  * For a shift of nonzero `n`, shift-immediate _and_ shift-register latency
    and throughput are 7 + 2*`n` CPI.
* Branch-not-taken latency and throughput are 7 CPI. Branch-taken latency and
  throughput are 8 CPI.
* JAL/JALR latency is 9 CPI, throughput is 7 CPI.
* Store latency and throughput are 8 CPI minimum. 2 cycles minimum are spent
  waiting for Wishbone ACK.
  * The core will not release STB/CYC between the store and fetch of the next
    instruction.
* Load latency is 10 CPI minimum, and throughput is 9 CPI. 2 cycles minimum
  are spent waiting for Wishbone ACK.
  * The core _will_ release STB/CYC before fetch of the next instruction.
* CSR instructions require an extra Decode cycle compared to all other
  instructions (to check for legality).
  * At minimum, a read of a read-only zero CSR register has a latency of 7 CPI,
    and a throughput of 6 CPI.
  * At maximum, `csrrc` has a latency of 11 CPI, and a throughput of 10 CPI.
* Entering an exception handler requires 5 clocks from the cycle at which
  the exception condition is detected.
* `mret` has a latency and throughput of 8 CPI.
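
The figures above can be turned into a rough back-of-the-envelope estimator. The
following is a minimal sketch, not part of Sentinel: the names `THROUGHPUT_CPI`,
`shift_cycles`, and `estimate_cycles` are made up for illustration, and the
numbers are simply transcribed from the list above. It assumes best-case
behavior (memory ACKs without wait states, and the stated throughput applies to
every instruction in the mix).

```python
# Cycle figures transcribed from the list above. All names are illustrative
# and do not exist in the Sentinel code base.

# Best-case throughput in cycles per instruction (CPI).
THROUGHPUT_CPI = {
    "simple": 4,            # addi, or, etc. with full Fetch/Decode overlap
    "branch_not_taken": 7,
    "branch_taken": 8,
    "jump": 7,              # JAL/JALR
    "store": 8,             # minimum; assumes Wishbone ACKs without wait states
    "load": 9,              # minimum; same assumption
    "mret": 8,
}


def shift_cycles(n: int, register_form: bool = False) -> tuple[int, int]:
    """Return (latency, throughput) in cycles for a shift by n bits."""
    if not 0 <= n <= 31:
        raise ValueError("RV32 shift counts are in the range 0..31")
    if n == 0:
        # A zero shift is the only case where the two forms differ.
        return (11, 10) if register_form else (10, 9)
    cycles = 7 + 2 * n  # same figure for immediate and register forms
    return (cycles, cycles)


def estimate_cycles(mix: dict[str, int]) -> int:
    """Sum best-case throughput over a mix of {instruction kind: count}."""
    return sum(THROUGHPUT_CPI[kind] * count for kind, count in mix.items())
```

With these figures, a shift by 31 costs 7 + 2*31 = 69 cycles, which illustrates
why the shift instructions are flagged as needing work.
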

## CSRs

Sentinel physically implements the following CSRs:

* `mscratch`
* `mcause`
  * The core can only physically trigger a subset of defined exceptions:
    * Machine external interrupt
    * Instruction access misaligned
    * Illegal instruction
    * Breakpoint
    * Load address misaligned
    * Store address misaligned
    * Environment call from M-mode

    In particular worth noting:
    * _Misaligned accesses are not implemented in hardware._
    * There is no machine timer (a 64-bit counter is a bit too much to
      ask for right now :(...).
* `mip`
  * Only the `MEIP` bit is implemented. The RISC-V Privileged Spec says:

    > `MEIP` is read-only in `mip`, and is set and cleared by a
    > platform-specific interrupt controller.

    The user must provide their own interrupt controller. See
    {class}`sentinel.top.Top`. One simple implementation is to `OR` all
    external interrupt sources together, and query each peripheral when `MEIP`
    is pending to find which peripherals need attention. This is implemented
    for the serial and timer peripherals in the `examples/attosoc.py` example.

    ```{note}
    In the future, I may implement the high (platform-specific) 16-bits of
    `mip`/`mie` to make interrupt-handling quicker.
    ```
* `mie`
  * Only the `MEIE` bit is implemented.
* `mstatus`
  * Only the `MPP`, `MPIE`, and `MIE` bits are implemented.
* `mtvec`
  * The `BASE` is writeable; only the Direct `MODE` setting is implemented.

    ```{todo}
    A read-only `BASE` is allowed, but I believe the Rust [support code](./support-code.md)
    assumes a writable `BASE`. I don't wish to fork [`riscv-rt`](https://github.com/rust-embedded/riscv/tree/master/riscv-rt)
    solely for a read-only `BASE`. So I deal with the potential loss of space
    savings for now.
    Revisit whether read-only `BASE` is feasible in the future.
    ```
* `mepc`
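
The `OR`-based interrupt scheme described under `mip` can be sketched as a
plain-Python behavioral model. This is only an illustration of the idea, not
the Amaranth code in `examples/attosoc.py`; the `Peripheral` class and both
functions are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Peripheral:
    """Hypothetical stand-in for a peripheral with one interrupt line."""
    name: str
    irq_pending: bool = False


def meip(peripherals: list[Peripheral]) -> bool:
    """Model of the MEIP bit: the OR of all external interrupt sources."""
    return any(p.irq_pending for p in peripherals)


def pending_sources(peripherals: list[Peripheral]) -> list[str]:
    """What a handler would do once MEIP is set: query every peripheral."""
    return [p.name for p in peripherals if p.irq_pending]
```

The trade-off is that the handler cannot tell from `mip` alone which source
fired, so it has to poll each peripheral in turn.
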

The following CSRs are implemented as read-only zero and trigger an exception
on an attempt to write:

* `mvendorid`
* `marchid`
* `mimpid`
* `mhartid`
* `mconfigptr`

The following CSRs are implemented as read-only zero (no exception on write):

* `misa`
* `mstatush`
* `mcountinhibit`
* `mtval`
* `mcycle`
* `minstret`
* `mhpmcounter3-31`
* `mhpmevent3-31`

All remaining machine-mode CSRs are unimplemented and trigger an exception on
_any_ access:

* `medeleg`
* `mideleg`
* `mcounteren`
* `mtinst`
* `mtval2`
* `menvcfg`
* `menvcfgh`
* `mseccfg`
* `mseccfgh`
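
The CSR categories in this section can be summarized as a small lookup. This is
a sketch for reasoning about the lists, not Sentinel's decode logic: `Access`,
`classify`, and `access_traps` are hypothetical names. It also assumes that
accesses to the physically implemented CSRs never trap, which the text implies
but does not state, and it treats any name not listed as "remaining", so it is
only meaningful for machine-mode CSR names.

```python
from enum import Enum


class Access(Enum):
    IMPLEMENTED = "physically implemented"
    RO_ZERO_TRAP_ON_WRITE = "read-only zero, exception on write"
    RO_ZERO = "read-only zero, no exception on write"
    UNIMPLEMENTED = "exception on any access"


IMPLEMENTED = {"mscratch", "mcause", "mip", "mie", "mstatus", "mtvec", "mepc"}
RO_ZERO_TRAP_ON_WRITE = {"mvendorid", "marchid", "mimpid", "mhartid", "mconfigptr"}
RO_ZERO = (
    {"misa", "mstatush", "mcountinhibit", "mtval", "mcycle", "minstret"}
    | {f"mhpmcounter{i}" for i in range(3, 32)}  # mhpmcounter3-31
    | {f"mhpmevent{i}" for i in range(3, 32)}    # mhpmevent3-31
)


def classify(name: str) -> Access:
    """Map a machine-mode CSR name to the category it is listed under."""
    if name in IMPLEMENTED:
        return Access.IMPLEMENTED
    if name in RO_ZERO_TRAP_ON_WRITE:
        return Access.RO_ZERO_TRAP_ON_WRITE
    if name in RO_ZERO:
        return Access.RO_ZERO
    return Access.UNIMPLEMENTED  # "all remaining machine-mode CSRs"


def access_traps(name: str, is_write: bool) -> bool:
    """Return True if the access would raise an exception per the lists."""
    kind = classify(name)
    if kind is Access.UNIMPLEMENTED:
        return True
    return kind is Access.RO_ZERO_TRAP_ON_WRITE and is_write
```
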
23 changes: 19 additions & 4 deletions doc/development/overview.md
@@ -21,7 +21,6 @@ PDM and DoIt complement each other:
I leverage both PDM and DoIt for increased flexibility to run tests, benchmarks,
and generate examples.

(pdm-scripts)=
## PDM Scripts

Scripts defined in `[tool.pdm.scripts]` are the main
@@ -73,15 +72,18 @@ scripts must pass as part of CI:
* `gen`, `demo`, `demo-rust`: Check that code/demo generation works.
* `test-quick`, `rvformal-all`, and `riscof-all`: Run tests.


```{todo}
Although [ReadTheDocs](https://sentinel-cpu.readthedocs.io/en/latest/) handles
building docs at present, CI release should be gated on building docs
successfully.
```

If necessary, the above PDM scripts invoke `doit`, which reads the `dodo.py`
file to find out how to do the actual work.[^1]

(doit-tasks)=
## DoIt Tasks

`doit` tasks are "low-level" tasks wrapped by the PDM scripts {ref}`above <pdm-scripts>`.
`doit` tasks are "low-level" tasks wrapped by the PDM scripts [above](#pdm-scripts).
_They should be treated as private and subject to change._ However, I provide
a `doit` PDM script to call `doit` directly if necessary. For instance, to
list available `doit` tasks (_including [private tasks](https://pydoit.org/tasks.html#private-hidden-tasks)_),
@@ -116,6 +118,19 @@ run_sby run symbiyosys flow on Sentinel, "doit list --all run_s
ucode assemble microcode and copy non-bin artifacts to root
```

I've documented the `doit` tasks (but not subtasks) as a courtesy, and to make sure
developers/users don't get stuck. That said, **prefer running `pdm` as a
wrapper to `doit` rather than running `doit` directly.**

```{note}
I make heavy use of [DoIt subtasks](https://pydoit.org/tasks.html#sub-tasks).
These are understandably excluded from default help output. Run
`pdm doit list --all [task]` to see all sub-tasks for a higher-level task.
For instance, the `run_sby:reg_ch0` subtask runs the [Register Check](https://github.com/YosysHQ/riscv-formal/blob/main/docs/procedure.md#register-checks)
for `riscv-formal`.
```

```{todo}
Decide whether to commit to treat all `doit` tasks as private/hidden or expose a set
to a user/developer (no leading underscore). Right now, I am not being consistent.
33 changes: 33 additions & 0 deletions doc/development/support-code.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# Support Code

`sentinel-rt` is a _currently-empty_ Rust support crate. If necessary, it will
contain support routines optimized for the Sentinel RISC-V _implementation_
that LLVM or Rust wouldn't generally know about. I have three potential use
cases:

1. Wrappers over custom opcodes[^1] and the slow shift operators :),
2. Related to 1., [`compiler-builtins`](https://github.com/rust-lang/compiler-builtins)
specialization if possible[^2].
3. Runtime/Machine Mode code that is incompatible with the existing
[`riscv-rt`](https://github.com/rust-embedded/riscv/tree/master/riscv-rt),
_but_ compatible with the RISC-V spec[^3].

However, at present, I don't need any special support code, so `sentinel-rt`
is just a reserved crate with example code for demo bitstreams.

```{note}
If I expand demos such that multiple linker scripts are required, the examples
_directory_ will become an examples crate. Do not depend on
`sentinel-rt/examples` being stable; the source root is _already_
a [workspace](https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html)!
```

## Footnotes

[^1]: A `memcpy` instruction is a good candidate for a custom microcoded
    instruction with speedups.
[^2]: Not clear to me that this _is_ possible! Just something I thought of.
[^3]: The big one here is that RISC-V permits hardcoding `MTVEC`, but last I
    checked, this was not supported by `riscv-rt`. This would likely be a size
    win, but I don't want to create a fork of `riscv-rt` just for this one
    edge case, so I deal.
52 changes: 52 additions & 0 deletions doc/development/testing.md
@@ -1,2 +1,54 @@
# Testing Sentinel

```{todo}
This section needs to be fleshed out; right now it is an import of the old
`README.md`.
```

## Run Tests

```
pdm test
```

or

```
pdm test-quick
```

The above will invoke `pytest` and test Sentinel against handcrafted examples,
as well as the riscv-test repo binaries. See the `README.md`
in `tests/upstream` for information on how to refresh the binaries.

Right now (11/5/2023), the difference between `test` and `test-quick` is
minimal.

## Run RISC-V Formal Flow

```
pdm rvformal-all [-n num_cores]
```

or

```
pdm rvformal test-name
```

See `README.md` in `tests/formal` for more information, including
valid/available test names.

## Run RISCOF Flow

```
pdm riscof-all
```

or

```
pdm riscof-override /path/to/test_list.yaml
```

See `README.md` in `tests/riscof` for more information.
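
If the test flows above need to be driven from a script rather than typed by
hand, one option is a thin wrapper over `subprocess`. This is a hedged sketch
only: `pdm_command` and `run_pdm` are hypothetical helpers, not part of
Sentinel, and the script names and the `-n` flag are taken verbatim from the
commands shown on this page. It assumes `pdm` is on `PATH` and that the
working directory is the repository root.

```python
import shutil
import subprocess


def pdm_command(script: str, *args: str) -> list[str]:
    """Build the argv for a PDM script, e.g. ["pdm", "rvformal-all", "-n", "4"]."""
    return ["pdm", script, *args]


def run_pdm(script: str, *args: str) -> int:
    """Run a PDM script and return its exit code (0 means success)."""
    if shutil.which("pdm") is None:
        raise RuntimeError("pdm was not found on PATH")
    return subprocess.run(pdm_command(script, *args), check=False).returncode
```

For example, `run_pdm("rvformal-all", "-n", "4")` would launch the RISC-V
Formal flow with four cores, and `run_pdm("test-quick")` the quick `pytest` run.
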
2 changes: 2 additions & 0 deletions doc/index.md
@@ -16,8 +16,10 @@ Public API <usage/reference>
development/overview
development/internals
development/microcode
development/support-code
development/testing
development/guidelines
CHANGELOG <changes>
TODO List <todo>
```
