Ariane¶

Environment Creation¶
[1]:

from a2perf.domains import circuit_training
import gymnasium as gym
env = gym.make('CircuitTraining-Ariane-v0')
Action Space | Discrete(16384)
---|---
Observation Space | Dict('current_node': Box(0, 3499, (1,), int32), 'fake_net_heatmap': Box(0.0, 1.0, (16384,), float32), 'is_node_placed': Box(0, 1, (3500,), int32), 'locations_x': Box(0.0, 1.0, (3500,), float32), 'locations_y': Box(0.0, 1.0, (3500,), float32), 'mask': Box(0, 1, (16384,), int32), 'netlist_index': Box(0, 0, (1,), int32))
Reward Range | (0, 1)
Creation | gym.make("CircuitTraining-Ariane-v0")
Description¶
Circuit Training is an open-source framework for generating chip floor plans with distributed deep reinforcement learning. This framework reproduces the methodology published in the Nature 2021 paper:
A graph placement methodology for fast chip design. Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Wenjie Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter & Jeff Dean, 2021. Nature, 594(7862), pp.207-212. [PDF]
At each timestep, the agent must place a single macro onto the chip canvas.
Note: this environment is only supported on Linux-based OSes.
Action Space¶
[2]:

env.action_space

[2]:

Discrete(16384)

Circuit Training represents the chip canvas as a grid. The action space corresponds to the different locations where the next macro can be placed onto the canvas without violating any hard constraints on density or blockages. In the Ariane netlist case, the canvas is of size \(128 \times 128\), resulting in \(16384\) possible actions. At each step, the agent places a macro. Once all macros are placed, a force-directed method is used to place clusters of standard cells.
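The observation's mask (described below) flags which of the 16,384 grid cells are legal for the current macro. A minimal numpy sketch of sampling uniformly among valid cells; `sample_valid_action` is a hypothetical helper, not part of A2Perf, and a toy 16-cell mask stands in for the real \(128 \times 128\) canvas:

```python
import numpy as np

def sample_valid_action(mask: np.ndarray, rng: np.random.Generator) -> int:
    """Sample uniformly among grid cells whose mask entry is 1."""
    valid = np.flatnonzero(mask)  # indices of placeable cells
    if valid.size == 0:
        raise ValueError("no valid placement locations")
    return int(rng.choice(valid))

# Toy mask: only cells 5 and 7 of a 16-cell canvas are placeable.
mask = np.zeros(16, dtype=np.int32)
mask[[5, 7]] = 1
action = sample_valid_action(mask, np.random.default_rng(0))
assert action in (5, 7)
```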
Observation Space¶
[3]:

env.observation_space

[3]:

Dict('current_node': Box(0, 3499, (1,), int32), 'fake_net_heatmap': Box(0.0, 1.0, (16384,), float32), 'is_node_placed': Box(0, 1, (3500,), int32), 'locations_x': Box(0.0, 1.0, (3500,), float32), 'locations_y': Box(0.0, 1.0, (3500,), float32), 'mask': Box(0, 1, (16384,), int32), 'netlist_index': Box(0, 0, (1,), int32))

The observation space encodes information about the partial placement of the circuit. This includes:
- current_node: the current node to be placed, a single integer ranging from 0 to 3499.
- fake_net_heatmap: a fake net heatmap, providing a continuous representation of estimated connectivity with values between 0.0 and 1.0 across 16,384 points.
- is_node_placed: the placement status of nodes, a binary array of size 3500 showing whether each node has been placed (1) or not (0).
- locations_x: node locations along the x-axis, a continuous array of size 3500 with values from 0.0 to 1.0 representing the x-coordinates of the nodes.
- locations_y: node locations along the y-axis, analogous to locations_x but for the y-coordinates.
- mask: a binary array of size 16,384 indicating the validity of each placement location.
- netlist_index: the netlist index. This usually acts as a placeholder and is fixed at 0.
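As a sanity check, an observation can be verified against the shapes and bounds above with plain numpy; `SPEC` and `check_observation` are hypothetical helpers, not part of A2Perf:

```python
import numpy as np

# Expected (shape, low, high) per key, matching the Dict space above.
SPEC = {
    "current_node":     ((1,),     0,   3499),
    "fake_net_heatmap": ((16384,), 0.0, 1.0),
    "is_node_placed":   ((3500,),  0,   1),
    "locations_x":      ((3500,),  0.0, 1.0),
    "locations_y":      ((3500,),  0.0, 1.0),
    "mask":             ((16384,), 0,   1),
    "netlist_index":    ((1,),     0,   0),
}

def check_observation(obs: dict) -> None:
    """Raise if any entry is missing, mis-shaped, or out of bounds."""
    for key, (shape, low, high) in SPEC.items():
        arr = np.asarray(obs[key])
        assert arr.shape == shape, f"{key}: bad shape {arr.shape}"
        assert arr.min() >= low and arr.max() <= high, f"{key}: out of bounds"

# A reset-style observation: nothing placed yet, all actions valid.
obs = {
    "current_node": np.array([0], dtype=np.int32),
    "fake_net_heatmap": np.zeros(16384, dtype=np.float32),
    "is_node_placed": np.zeros(3500, dtype=np.int32),
    "locations_x": np.zeros(3500, dtype=np.float32),
    "locations_y": np.zeros(3500, dtype=np.float32),
    "mask": np.ones(16384, dtype=np.int32),
    "netlist_index": np.array([0], dtype=np.int32),
}
check_observation(obs)  # passes silently
```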
Optional parameters:¶
Parameter | Type | Default | Description
---|---|---|---
netlist_file | str | path to … | Path to the input netlist file. Predefined by using gym.make.
init_placement | str | path to … | Path to the input initial placement file, used to read grid and canvas size. Predefined by using gym.make.
plc_wrapper_main | str |  | Main PLC wrapper.
create_placement_cost_fn | Callable |  | A function that creates the placement cost object, given the netlist and initial placement files.
std_cell_placer_mode | str |  | Options for fast standard cell placement. The fd option uses the force-directed method.
cost_fn | Callable |  | The cost function that, given the plc object, returns the RL cost.
global_seed | int |  | Global seed for initializing environment features, ensuring consistency across actors.
netlist_index | int |  | Netlist index in the model static features.
is_eval | bool |  | If set, saves the final placement in the output directory.
save_best_cost | bool |  | If set, saves the placement if its cost is better than the previously saved placement.
output_plc_file | str |  | The path to save the final placement.
cd_finetune | bool |  | If True, runs coordinate descent to fine-tune macro orientations. Meant for evaluation, not training.
cd_plc_file | str |  | Name of the coordinate descent fine-tuned plc file.
train_step | Optional[tf.Variable] |  | A tf.Variable indicating the training step, only used for saving the snapshot placement.
output_all_features | bool |  | If true, outputs all observation features. Otherwise, only outputs dynamic observations.
node_order | str |  | The sequence order of nodes placed by RL.
save_snapshot | bool |  | If true, saves the snapshot placement.
save_partial_placement | bool |  | If true, evaluation also saves the placement even if RL does not place all nodes when an episode is done.
use_legacy_reset | bool |  | If true, uses the legacy reset method.
use_legacy_step | bool |  | If true, uses the legacy step method.
render_mode | str |  | Specifies the rendering mode.
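These optional parameters can be overridden at creation time, since gym.make forwards extra keyword arguments to the environment constructor. A configuration sketch, where the chosen values are purely illustrative and the parameter names are assumed to match the table above:

```python
import gymnasium as gym
from a2perf.domains import circuit_training  # registers CircuitTraining-Ariane-v0

# Illustrative overrides; any parameter not listed keeps its default.
env = gym.make(
    "CircuitTraining-Ariane-v0",
    global_seed=1,             # reproducible environment features
    output_all_features=True,  # emit static as well as dynamic observations
)
```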
Rewards¶
The reward is evaluated at the end of each episode. The placement cost binary is used to calculate the reward based on proxy wirelength, congestion, and density. An infeasible placement results in a reward of -1.0.
The reward function is defined as:
\[R_{p,g} = -\text{Wirelength}(p, g) - \lambda \cdot \text{Congestion}(p, g) - \gamma \cdot \text{Density}(p, g)\]
Where:
- \(p\) represents the placement
- \(g\) represents the netlist graph
- \(\lambda\) is the congestion weight
- \(\gamma\) is the density weight
Default values in A2Perf:
- The congestion weight \(\lambda\) is set to 0.01
- The density weight \(\gamma\) is set to 0.01
- The maximum density threshold is set to 0.6

These default values are based on the methodology described in Mirhoseini et al. (2021).
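With these defaults, the reward for a feasible placement is just a weighted sum of the three proxy metrics. The real computation lives in the placement cost binary; the `proxy_reward` helper below is hypothetical and only illustrates the arithmetic:

```python
# Hypothetical helper mirroring R(p, g) = -Wirelength - λ·Congestion - γ·Density
def proxy_reward(wirelength: float, congestion: float, density: float,
                 congestion_weight: float = 0.01,
                 density_weight: float = 0.01) -> float:
    return -(wirelength + congestion_weight * congestion + density_weight * density)

# With A2Perf's defaults (λ = γ = 0.01) and illustrative proxy metrics:
r = proxy_reward(wirelength=0.5, congestion=0.8, density=0.6)
assert abs(r - (-0.514)) < 1e-9  # -(0.5 + 0.008 + 0.006)
```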
Termination¶
The episode terminates once all macros have been placed on the canvas, after which the final reward is calculated.
Registered Configurations¶
CircuitTraining-Ariane-v0