Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make all GTH tx interfaces operate in the same domain #688

Merged
merged 7 commits into from
Dec 5, 2024

Conversation

leonschoorl
Copy link
Contributor

@leonschoorl leonschoorl commented Dec 2, 2024

This PR moves all transceiver TX interfaces to the same domain, instead of separate domains with (potentially) small phase offsets.

Note that the RX interfaces are in separate domains. (by design)

Background

In typical applications either:

  • All transmit links are on independent domains and all receive links are on independent links.
  • All transmit links are on the same domain and all receive links are on the same domain. This is useful in situations where we're (for example) building a single 400 gbit/s link out of multiple 100 gbit/s links.

Bittide is unique (?) in that we want an asymmetric setup: all the transmit domains should be the (exact) same, while receive domains remain independent. Because this is not covered by the typical use cases, we need to manually implement clock logic instead of relying on high-level Xilinx primitives.

Up to now, we've configured the transceivers as if all links were independent. Still, we provided the same reference clock to all individual links. In practice this would therefore amount to the asymmetric case. The problem is that static timing analysis will not apply across these domains, and we would therefore need clock domain crossing logic (FIFOs) to satisfy our design rule checks. This is for good reasons though: if we cannot apply STA, we cannot guarantee the sequential/digital abstraction and therefore risk failing bitstreams.

Up to now, this hasn't mattered much: for our experiments we haven't critically relied on the transmit domain being exactly equal. With the on-FPGA switch demo around the corner, we would like to get rid of the separate domains, hence this PR.

Old situation

Transceivers-Old drawio

New situation

Transceivers-New drawio

Future work

There is now a whole bunch of redundant code dealing with n domains, which can be simplified. Preferably we'd do this after merging #654, as that PR will end up deleting a lot of code anyway.

Copy link
Contributor

@martijnbastiaan martijnbastiaan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM! I've got a few questions.

Edit: I've added a PR cover letter including diagrams.

bittide/src/Clash/Cores/Xilinx/GTH/BlackBoxes.hs Outdated Show resolved Hide resolved
Comment on lines 162 to 165
(N.Clock _, N.Clock _) -> True
(N.ClockN _, N.ClockN _) -> True
(N.Reset _, N.Reset _) -> True
(N.Enable _, N.Enable _) -> True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this compare domains? Could you add the explanation in a comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this merged, add a comment in another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this functions gets something like: Clock rxUser2 and Clock GthRx
So the right side is a concrete monomorphic type, but the left side still contains a type variable.

If you really want to properly check this, you'd have to do unification of the type variables over all the ports together.
And you can't do a simple per port check like this.

bittide/src/Clash/Cores/Xilinx/GTH/Internal.hs Outdated Show resolved Hide resolved
bittide/src/Clash/Cores/Xilinx/GTH/Internal.hs Outdated Show resolved Hide resolved
bittide/tests/Tests/Transceiver.hs Outdated Show resolved Hide resolved
bittide/tests/Tests/Transceiver.hs Outdated Show resolved Hide resolved
bittide-shake/src/Clash/Shake/Vivado.hs Outdated Show resolved Hide resolved
bittide-shake/src/Clash/Shake/Vivado.hs Show resolved Hide resolved
Comment on lines +200 to +201
{ txOutClock :: Clock tx1
-- ^ TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete this right? The clock is generated outside of this block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added #696

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

txOutClock can not be deleted.
It's only connected one of the channels.
But on that one it's the source that feeds the "user clock network", which is what creates the txClock(2) used for everything.

Comment on lines +214 to +215
, rxOutClock :: Clock rx1
-- ^ TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added #696

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar as for the tx side, but here all of them are actually used individually because all domains are different.

@martijnbastiaan
Copy link
Contributor

martijnbastiaan commented Dec 4, 2024

Before merging: please squash the formatting commits into parent(s).

Edit: done

@martijnbastiaan
Copy link
Contributor

martijnbastiaan commented Dec 5, 2024

Intermittent failures for link related tests. Something is wrong..

Edit: happy to report that switching to the Xilinx RX/TX black boxes plus noReset fixes the intermittent failures:

Screenshot from 2024-12-05 16-37-57

Let's keep the handwritten implementation in the history in case we need it
@martijnbastiaan martijnbastiaan merged commit 0b84e96 into staging Dec 5, 2024
28 checks passed
@martijnbastiaan martijnbastiaan deleted the one-tx-domain branch December 5, 2024 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants