Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RGMII core #7

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Add RGMII core #7

wants to merge 3 commits into from

Conversation

t-wallet
Copy link
Collaborator

Adds a module that allows us to interface with an RGMII PHY, both via ordinary signals and via the PacketStream streaming protocol.

Co-authored-by: Jasmijn Bookelmann <[email protected]>
Co-authored-by: Cato van Ojen <[email protected]>
Co-authored-by: MatthijsMu <[email protected]>
Co-authored-by: Jasper Laumen <[email protected]>
Co-authored-by: Mart Koster <[email protected]>
Co-authored-by: Bryan Rinders <[email protected]>
Co-authored-by: Daan Weessies <[email protected]>
Co-authored-by: Rowan Goemans <[email protected]>
Co-authored-by: syberant <[email protected]>
@rowanG077
Copy link
Member

rowanG077 commented Aug 27, 2024

I would upstream that under the ethernet core. It has no use beyond ethernet afaik.

@DigitalBrains1
Copy link
Member

I would upstream that under the ethernet core. It has no use behond ethernet afaik.

Right, perhaps the name Clash.Cores.Ethernet.RGMII makes more sense. We don't want to nest too deep, but that seems okay.

Also, we are now maintaining a Changelog for clash-cores? I don't think that necessarily makes sense. We havent' released an initial version. The concept of a change does not begin until after the first release. You could do a first entry that reads "this release includes these cores:", but to write "Added this-and-that core" doesn't make sense. "Added" implies that there was something before that did not have it.

Copy link
Member

@lmbollen lmbollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial thoughts :)

Comment on lines 33 to 53
-- | RX channel from the RGMII PHY
data RgmiiRxChannel dom domDDR = RgmiiRxChannel
{ rgmiiRxClk :: "rx_clk" ::: Clock dom
, rgmiiRxCtl :: "rx_ctl" ::: Signal domDDR Bit
, rgmiiRxData :: "rx_data" ::: Signal domDDR (BitVector 4)
}

instance Protocol (RgmiiRxChannel dom domDDR) where
type Fwd (RgmiiRxChannel dom domDDR) = RgmiiRxChannel dom domDDR
type Bwd (RgmiiRxChannel dom domDDR) = Signal dom ()

-- | TX channel to the RGMII PHY
data RgmiiTxChannel domDDR = RgmiiTxChannel
{ rgmiiTxClk :: "tx_clk" ::: Signal domDDR Bit
, rgmiiTxCtl :: "tx_ctl" ::: Signal domDDR Bit
, rgmiiTxData :: "tx_data" ::: Signal domDDR (BitVector 4)
}

instance Protocol (RgmiiTxChannel domDDR) where
type Fwd (RgmiiTxChannel domDDR) = RgmiiTxChannel domDDR
type Bwd (RgmiiTxChannel domDDR) = Signal domDDR ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are there two separate protocols for RX and TX?
We could also have a single

data Rgmii dom domDDR = Rgmii
  { clk :: "clk" ::: Clock dom
  , ctl :: "ctl" ::: Signal domDDR Bit
  , data :: "data" ::: Signal domDDR (BitVector 4)
  }

Right?
We would require OverloadedRecordDot together with DuplicateRecordFields and NoFieldSelectors

Copy link
Member

@rowanG077 rowanG077 Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because on the rx side you receive a clock signal. But on the tx side you have to forward the clock using an ODDR primitive. From the FPGA view it's actually no longer clock and can't be used as such.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you basically abusing the ODDR primitive here as a clock generator? Maybe we should add a wrapper for the ODDR primitive that outputs a Clock dom rather than a Signal domDDR Bit (note difference in domain, factor two period difference).

I don't think there's anything wrong with using an ODDR primitive to generate a clock. But Clash is strongly typed and while there is a certain isomorphism, I still feel this outputs a Clock, not a Signal. It is used as a time reference rather than being relative to one. The Signal domDDRs have a setup and hold constraint, the clock output does not. And more arguments revolving around the same principle.

Copy link
Member

@DigitalBrains1 DigitalBrains1 Sep 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a function

oddrClockGen ::
  forall dom domDdr .
  (DomainPeriod dom ~ (*) 2 (DomainPeriod domDdr)) =>
  Clock dom ->
  Reset dom ->  
  ( Clock dom ->
    Reset dom ->
    Enable dom ->
    Bit ->
    Signal dom Bit ->
    Signal domDdr Bit
  ) ->
  Clock dom

where we construct the function such that in HDL, it outputs the output of the DDR register while in Haskell it just outputs a Clock.

[edit]
If we add an internal helper primitive that is:

ddrSignalToClock ::
  forall dom domDdr .
  (DomainPeriod dom ~ (*) 2 (DomainPeriod domDdr)) =>
  Clock dom ->
  Signal domDdr Bit ->
  Clock dom
ddrSignalToClock clk !_ = clk

where the black box instead picks the DDR Bit signal as the output and ignores the clock input, the previous function becomes trivial. This internal primitive could go into Clash.Signal.Internal, and oddrClockGen could be added to Clash.Explicit.DDR.
[/edit]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the FPGA view it's actually no longer clock and can't be used as such.

I don't think you can use the output of the ODDR primitive, period. You can only wire it to a pin, not to any internal logic. So that remains the same whether it's a Signal or a Clock: it cannot be used at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True perhaps it's better to make it neither a clock or a signal. But rather something like WorldSignal or something. Which is just a newtype around signal which you can only route.

Copy link
Member

@DigitalBrains1 DigitalBrains1 Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds more like a GitHub wishlist issue, for the short term I'd like to propose to add oddrClockGen to clash-prelude and change the type here. If I understand correctly, that'd also make it so you can have one Protocol with both RX and TX in it, which would be a nice improvement, right?

(Provided I can convince other people working on clash-prelude that this is a good thing to add)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the status on this? I like the idea of having a Signal type that can only be routed. Changing it to a clock doesn't make sense to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The status from my point of view is that I was waiting for consensus or some form of decision of what to do.

Why doesn't the clock make sense to you? To me, the signal doesn't really make sense for the reasons I already indicated. The output of the DDR primitive is used as a clock for external circuitry and that is its only purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say just outputting Clock dom is not qualitatively different than outputting the ODDR-generated clock; it's just a different construction inside the FPGA that makes stuff easier. Even Xilinx recommends it here; one of the first search results I found.

src/Clash/Cores/Rgmii.hs Outdated Show resolved Hide resolved
src/Clash/Cores/Rgmii.hs Outdated Show resolved Hide resolved
Comment on lines 143 to 152
{- |
Circuit that adapts an `RgmiiRxChannel` to a `PacketStream`. Forwards data
from the RGMII receiver with one clock cycle latency so that we can properly
mark the last transfer of a packet: if we received valid data from the RGMII
receiver in the last clock cycle and the data in the current clock cycle is
invalid, we set `_last`. If the RGMII receiver gives an error, we set `_abort`.

__UNSAFE__: ignores backpressure, because the RGMII PHY is unable to handle that.
-}
unsafeRgmiiRxC ::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a configuration option that asserts abort when we lose transactions due to backpressure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pondering whether it will still be unsafe then...
We can also make an issue to make a safe version (it would just drop packets....)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting _abort does not make it safe. Imagine you get backpressure while transmitting the last transfer of a packet, then setting _abort does nothing.

I do not think there is a way to make this safe. If the RX receives backpressure while in the middle of receiving a packet, we have already sent transfers so we cannot drop the packet there anymore. You could maybe add a packet fifo to buffer entire packets to achieve this, but as we are in a 125 MHz clock domain, I think that will ruin our timings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since forward can not depend on backward, backpressure on the last transaction is indeed a problem since you can not set abort anymore.

It would be possible if we could terminate a packet with a 0 byte transaction.

@rowanG077 What is the motivation for making the Just (Index n) contain the index of the last bye rather than the number of bytes in the transaction?

Copy link
Member

@rowanG077 rowanG077 Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Precisely so 0 byte transactions are not possible. You are basically saying. Hello I have something for you. Oh please show me what you have. And then it's nothing.

If you use a skid buffer(registerFwd) it's possible to allow forward to depend on backwards. I would even add this to the documentation of the skid buffers in clash-protocols. It's one of the killer features of a skid buffer imo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you want to explicitly disallow 0 byte termination transfers?

Now any master must postpone its transaction until it knows more data is coming or until it knows "this" is the end of the packet.

Copy link
Member

@rowanG077 rowanG077 Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because a zero byte transfer just adds overhead for I think little reason. You have add another state to handle in all PacketStream components. It's also ambiguous when it occurs. For example when depacketizing should you add a zero byte termination? Can zero byte terminations be stripped of? Which for example is something you'd want to know in PacketFifo and even is essential for upConverter. And probably more questions... It just makes the protocol more complex.

Why is it a problem that the RGMII PacketStream master must postpone its transaction? All *MII interfaces I know behave this way. You get some bits and when valid falls it's
the end of packet. RGMII is even special in that regard that it transfers full bytes per clock cycle but others don't. RMII is 2 bits per cycle for example.

Copy link
Member

@rowanG077 rowanG077 Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand an UDP packet without payload is valid, same for TCP as well. So perhaps we need this extra complexity.

Do you have some thoughts @t-wallet?

Copy link
Collaborator Author

@t-wallet t-wallet Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can explore adding zero-data transactions, I think the fact that it can only happen on a packet boundary should make it a little easier to implement. Indeed with the current setup handling UDP packets without payload is impossible.

Another benefit of that is that we can remove depacketizeToDfC :) (Edit: that's incorrect, as we need to consume padding for some use cases, which depacketizerC obviously does not do)

Copy link
Collaborator Author

@t-wallet t-wallet Sep 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty UDP packets are not a problem for Ethernet because of padding. But, we should of course not rely on padding as maybe some other protocols have bigger headers or don't do padding.

I have experimented with changing the type of _last to Maybe (Index (dataWidth + 1)), so that its meaning becomes "the amount of valid bytes in the transaction". It was surprisingly easy to implement for all the generic components in clash-protocols: it only took me 2 hours to adjust everything. I didn't test it, but I suspect the extra resource overhead is very minimal. So, it's looking very promising. The next step is to also adapt our Ethernet components, and test whether it works in hardware.

@t-wallet t-wallet force-pushed the main branch 2 times, most recently from 55a2c84 to c3ad532 Compare August 28, 2024 10:09
src/Clash/Cores/Rgmii.hs Outdated Show resolved Hide resolved
Comment on lines 65 to 71
( forall a.
(NFDataX a, BitPack a) =>
Clock dom ->
Reset dom ->
Signal domDDR a ->
Signal dom (a, a)
) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note this function doesn't match the signature of Clash.Explicit.DDR.ddrIn. To use it, one would need to do something like:

... ddrInComp ...
 where
    ddrInComp = \clk rst -> ddrIn clk rst enableGen (deepErrorX "Undefined initial value")

I'd suggest adding Enable and reset value to the iddr function. Same thing for oddr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants