Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config: set minimal hartid width to 6 #2966

Merged
merged 2 commits into from
May 14, 2024
Merged

Config: set minimal hartid width to 6 #2966

merged 2 commits into from
May 14, 2024

Conversation

Tang-Haojin
Copy link
Member

This can help users who only build one core but then manually instantiate more than two cores in the SoC.

@Tang-Haojin Tang-Haojin requested a review from cyyself May 10, 2024 09:50
@cyyself
Copy link
Member

cyyself commented May 10, 2024

Thanks for doing this!
Looks good to me.

We may also need a config to split XSTop and XSCore to change something outside of XSCore faster. Just leave it as future work.

@poemonsense
Copy link
Member

poemonsense commented May 10, 2024

We may also need a config to split XSTop and XSCore to change something outside of XSCore faster. Just leave it as future work.

The major issue here is the diplomacy. Getting the correct parameters for outside XSCore requires all LazyModule nodes. We have to use fake nodes (and make sure they are the same as XSCore) if we want to separate XSCore and uncore.

Actually we have a transform to copy Core 0 to other cores. This reduces the FIRRTL -> verilog time but does not help the Chisel elaboration stage.

We don't know how to fix the issue better. cc @sequencer

@cyyself
Copy link
Member

cyyself commented May 10, 2024

We may also need a config to split XSTop and XSCore to change something outside of XSCore faster. Just leave it as future work.

The major issue here is the diplomacy. Getting the correct parameters for outside XSCore requires all LazyModule nodes. We have to use fake nodes (and make sure they are the same as XSCore) if we want to separate XSCore and uncore.

Indeed.

Actually we have a transform to copy Core 0 to other cores. This reduces the FIRRTL -> verilog time but does not help the Chisel elaboration stage.

Another way is to use LazyModule cloning, but we need to fix ChiselDB and ConstantIn first.

@poemonsense
Copy link
Member

but we need to fix ChiselDB and ConstantIn first.

Yes. No matter how we address the multicore issue, we need to ensure they are the same. ChiselDB and ConstantIn are destroying the dedup between multiple cores.

@sequencer
Copy link
Contributor

We don't know how to fix the issue better. cc @sequencer

Please use D/I... I'm prototyping the new diplomacy framework, but that needs a large refactor...

@cyyself
Copy link
Member

cyyself commented May 11, 2024

This can help users who only build one core but then manually instantiate more than two cores in the SoC.

Please wait, this line in cpl2 should also be fixed https://github.com/OpenXiangShan/CoupledL2/pull/102/files#diff-566b098f719cfe2cdea01bb41cd67c977a43516affa844e0512e321dd742170bR83

@cyyself cyyself self-requested a review May 11, 2024 06:13
@sequencer
Copy link
Contributor

sequencer commented May 11, 2024

From my mind:

  • All LazyModule should only be a SerializableModule that only depends on a SerializableModuleParameter. In this case, all SerializableModule can be safely instantiated via D/I. and is cacheable in Chisel.
  • All LazyModule should be a FixedIOModule that can instantiate its Interface(IO) before the Module elaboration.

The current implementation in diplomacy has these design warts, which won't be supported:

  • CDE.
  • implicit parameters.
  • In-memory linking from Edge to Bundle.

I'm drafting a NoC-based SoC framework w/ this methodology, eta 6m to get it ready.

@cyyself
Copy link
Member

cyyself commented May 11, 2024

I found a bug since the last time I modified CoupledL2:

src/main/scala/xiangshan/L2Top.scala:91

  val l2cache = coreParams.L2CacheParamsOpt.map(l2param =>
    LazyModule(new CoupledL2()(new Config((_, _, _) => {
      case L2ParamKey => l2param.copy(
          hartIds = Seq(p(XSCoreParamsKey).HartId),

Then, l2cache will only get a hartId sized 1.

But we use the size of this List to construct hartId in CoupledL2:

coupledL2/src/main/scala/coupledL2/CoupledL2.scala:84

  lazy val msgSizeBits = edgeIn.bundle.sizeBits
  lazy val sourceIdAll = 1 << sourceIdBits

  lazy val hartIdLen: Int = log2Up(cacheParams.hartIds.length)

  val mshrsAll = cacheParams.mshrs

So, currently, the hartIdLen in coupledL2 is wrong.

I wanted to fix it today, but then I found many things that should be fixed first, especially for ConstantIn and ChiselDB support for hartid from io. If you want to use it temporarily, you can just apply the following patches with your branch:

Xiangshan:

diff --git a/src/main/scala/top/Top.scala b/src/main/scala/top/Top.scala
index 11b5b8806..cab8ad853 100644
--- a/src/main/scala/top/Top.scala
+++ b/src/main/scala/top/Top.scala
@@ -27,6 +27,7 @@ import device._
 import chisel3.stage.ChiselGeneratorAnnotation
 import org.chipsalliance.cde.config._
 import freechips.rocketchip.diplomacy._
+import freechips.rocketchip.tile._
 import freechips.rocketchip.tilelink._
 import freechips.rocketchip.jtag.JTAGIO
 import chisel3.experimental.{annotate, ChiselAnnotation}
@@ -74,6 +75,7 @@ class XSTop()(implicit p: Parameters) extends BaseXSSoc() with HasSoCParameter
         hartIds = tiles.map(_.HartId),
         FPGAPlatform = debugOpts.FPGAPlatform
       )
+      case MaxHartIdBits => p(MaxHartIdBits)
     })))
   )
 
diff --git a/src/main/scala/xiangshan/L2Top.scala b/src/main/scala/xiangshan/L2Top.scala
index 7102237d3..182b87c8f 100644
--- a/src/main/scala/xiangshan/L2Top.scala
+++ b/src/main/scala/xiangshan/L2Top.scala
@@ -21,7 +21,7 @@ import org.chipsalliance.cde.config._
 import chisel3.util.{Valid, ValidIO}
 import freechips.rocketchip.diplomacy._
 import freechips.rocketchip.interrupts._
-import freechips.rocketchip.tile.{BusErrorUnit, BusErrorUnitParams, BusErrors}
+import freechips.rocketchip.tile.{BusErrorUnit, BusErrorUnitParams, BusErrors, MaxHartIdBits}
 import freechips.rocketchip.tilelink._
 import coupledL2.{CoupledL2, L2ParamKey}
 import system.HasSoCParameter
@@ -94,6 +94,7 @@ class L2Top()(implicit p: Parameters) extends LazyModule
           hartIds = Seq(p(XSCoreParamsKey).HartId),
           FPGAPlatform = debugOpts.FPGAPlatform
         )
+      case MaxHartIdBits => p(MaxHartIdBits)
     })))
   )
   val l2_binder = coreParams.L2CacheParamsOpt.map(_ => BankBinder(coreParams.L2NBanks, 64))

submodule CoupledL2

diff --git a/src/main/scala/coupledL2/CoupledL2.scala b/src/main/scala/coupledL2/CoupledL2.scala
index e6cf2ae..4e743f9 100644
--- a/src/main/scala/coupledL2/CoupledL2.scala
+++ b/src/main/scala/coupledL2/CoupledL2.scala
@@ -23,6 +23,7 @@ import chisel3._
 import chisel3.util._
 import utility.{FastArbiter, ParallelMax, ParallelPriorityMux, Pipeline, RegNextN}
 import freechips.rocketchip.diplomacy._
+import freechips.rocketchip.tile.MaxHartIdBits
 import freechips.rocketchip.tilelink._
 import freechips.rocketchip.tilelink.TLMessages._
 import freechips.rocketchip.util._
@@ -83,7 +84,7 @@ trait HasCoupledL2Parameters {
   lazy val msgSizeBits = edgeIn.bundle.sizeBits
   lazy val sourceIdAll = 1 << sourceIdBits
 
-  lazy val hartIdLen: Int = log2Up(cacheParams.hartIds.length)
+  lazy val hartIdLen: Int = p(MaxHartIdBits)
 
   val mshrsAll = cacheParams.mshrs
   val idsAll = 256// ids of L2 //TODO: Paramterize like this: max(mshrsAll * 2, sourceIdAll * 2)

submodule HuanCun

diff --git a/src/main/scala/huancun/HuanCun.scala b/src/main/scala/huancun/HuanCun.scala
index 5722a71..52f9eb9 100644
--- a/src/main/scala/huancun/HuanCun.scala
+++ b/src/main/scala/huancun/HuanCun.scala
@@ -23,6 +23,7 @@ import org.chipsalliance.cde.config.Parameters
 import chisel3._
 import chisel3.util._
 import freechips.rocketchip.diplomacy._
+import freechips.rocketchip.tile.MaxHartIdBits
 import freechips.rocketchip.tilelink._
 import freechips.rocketchip.tilelink.TLMessages._
 import freechips.rocketchip.util.{BundleField, BundleFieldBase, UIntToOH1}
@@ -103,7 +104,7 @@ trait HasHuanCunParameters {
 
   lazy val outerSinkBits = edgeOut.bundle.sinkBits
 
-  lazy val hartIdLen: Int = log2Up(cacheParams.hartIds.length)
+  lazy val hartIdLen: Int = p(MaxHartIdBits)
 
   val block_granularity = if (!cacheParams.inclusive && cacheParams.clientCaches.nonEmpty) {
     cacheParams.clientCaches.head.blockGranularity

@poemonsense
Copy link
Member

I found many things that should be fixed first, especially for ConstantIn and ChiselDB support for hartid from io.

FYI, #2672

@poemonsense
Copy link
Member

Also, by default, both ChiselDB and constantin are disabled. We can skip them first and fix them in the next step.

@Tang-Haojin
Copy link
Member Author

Also, by default, both ChiselDB and constantin are disabled. We can skip them first and fix them in the next step.

There is an AlwaysBasicDB which mainly used by tllog. It may make dedup fail.

cyyself added a commit to OpenXiangShan/CoupledL2 that referenced this pull request May 11, 2024
Before this commit, we use HartIds: Seq[Int] in L2Param. However, it
will only contains one element since the code are copied from L3, and
L2 is private to a core now, which confuse some develops and caused a
bug [1]. To prevent it being happend again, we should use a single
HartId here and refactor many its consumes to use single HartId.

[1] OpenXiangShan/XiangShan#2966 (comment)

Signed-off-by: Yangyu Chen <[email protected]>
cyyself added a commit to OpenXiangShan/CoupledL2 that referenced this pull request May 11, 2024
Before this commit, we use HartIds: Seq[Int] in L2Param. However, it
only contain one element since the code is copied from L3, and L2 is now
private to a core, which confuses some developers and causes a bug [1].
To prevent it from happening again, we should use a single HartId here
and refactor many of its consumers to use a single HartId.

[1] OpenXiangShan/XiangShan#2966 (comment)

Signed-off-by: Yangyu Chen <[email protected]>
Ivyfeather pushed a commit to OpenXiangShan/CoupledL2 that referenced this pull request May 12, 2024
* configs: use single hartid

Before this commit, we use HartIds: Seq[Int] in L2Param. However, it
only contain one element since the code is copied from L3, and L2 is now
private to a core, which confuses some developers and causes a bug [1].
To prevent it from happening again, we should use a single HartId here
and refactor many of its consumers to use a single HartId.

[1] OpenXiangShan/XiangShan#2966 (comment)

Signed-off-by: Yangyu Chen <[email protected]>

* Bump HuanCun

Signed-off-by: Yangyu Chen <[email protected]>

---------

Signed-off-by: Yangyu Chen <[email protected]>
@XiangShanRobot
Copy link

[Generated by IPC robot]
commit: f4b08c2

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
f4b08c2 1.854 0.449 2.103 1.182 2.170 2.181 2.329 0.973 1.386 1.294 2.742 2.605 2.283 2.951

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
1e018fb 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.962 1.377 1.427 3.123 2.639 2.451 2.960
3c71816 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.962 1.377 1.427 3.123 2.639 2.451 2.960
dfe034b 1.854 0.449 2.103 1.182 2.170 2.181 2.328 0.972 1.386 1.294 2.742 2.592 2.283 2.951
bdc1606 1.854 0.449 2.101 1.182 2.170 2.181 2.329 0.973 1.386 1.294 2.742 2.589 2.283 2.951
dc5a918 1.854 0.449 2.103 1.181 2.170 2.181 2.329 0.973 1.386 1.294 2.742 2.589 2.283 2.951
bad6084 1.854 0.449 2.103 1.182 2.170 2.181 2.329 0.972 1.386 1.294 2.742 2.589 2.283 2.951
c686adc 1.854 0.449 2.103 1.182 2.170 2.181 2.335 0.972 1.386 1.294 2.742 2.589 2.283 2.951
bc3d558 1.854 0.449 2.100 1.182 2.170 2.181 2.329 0.972 1.380 1.294 2.742 2.589 2.283 2.951
a58f171 1.854 0.449 2.104 1.182 2.170 2.181 2.329 0.972 1.386 1.294 2.742 2.586 2.283 2.951
ff74867 1.898 0.448 2.105 1.186 2.173 2.175 2.335 0.960 1.372 1.288 2.745 2.583 2.285 2.958

Tang-Haojin and others added 2 commits May 13, 2024 11:11
This can help users who only build one core but
then manually instantiate more than two cores in
the SoC.
Since we have set the default HartIdBits to 6, we should also allow it to
be overridden by parameters.

Signed-off-by: Yangyu Chen <[email protected]>
@XiangShanRobot
Copy link

[Generated by IPC robot]
commit: 0837690

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
0837690 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.962 1.377 1.427 3.123 2.639 2.451 2.960

master branch:

commit astar copy_and_run coremark gcc gromacs lbm linux mcf microbench milc namd povray wrf xalancbmk
5e237ba
363530d
a72b131
05d833a 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.961 1.377 1.427 3.123 2.639 2.451 2.960
9cb05b4 0.450 2.103 2.330 1.377 2.639
4daa5bf 0.450 2.103 1.190 2.468 2.593 2.329 0.962 1.377 1.427 3.123 2.639 2.960
1e018fb 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.962 1.377 1.427 3.123 2.639 2.451 2.960
3c71816 1.854 0.450 2.103 1.190 2.468 2.593 2.330 0.962 1.377 1.427 3.123 2.639 2.451 2.960
dfe034b 1.854 0.449 2.103 1.182 2.170 2.181 2.328 0.972 1.386 1.294 2.742 2.592 2.283 2.951
bdc1606 1.854 0.449 2.101 1.182 2.170 2.181 2.329 0.973 1.386 1.294 2.742 2.589 2.283 2.951

@Tang-Haojin Tang-Haojin merged commit b628978 into master May 14, 2024
4 checks passed
@Tang-Haojin Tang-Haojin deleted the thj-p1 branch May 14, 2024 03:24
Diana0525 pushed a commit to Diana0525/XiangShan that referenced this pull request May 17, 2024
This can help users who only build one core but then manually
instantiate more than two cores in the SoC.

---------

Signed-off-by: Yangyu Chen <[email protected]>
Co-authored-by: Yangyu Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants