Syedhasan7/pes_asic_class

VLSI Physical Design for ASICs

Objective

The objective of VLSI (Very Large Scale Integration) physical design for ASICs (Application-Specific Integrated Circuits) is to transform a logical design description (RTL - Register Transfer Level) into a physical layout that can be fabricated as an integrated circuit. This involves translating the high-level functional representation of the circuit into a physical implementation that meets design constraints, performance targets, and manufacturability requirements.

SKILL OUTCOMES

  • Architectural Design
  • RTL Design / Behavioral Modeling
  • Floorplanning
  • Placement
  • Clock Tree Synthesis
  • Routing

INSTALLATION

Riscv_toolchain Installation

https://github.com/kunalg123/riscv_workshop_collaterals/blob/master/run.sh

  • Download the run.sh

  • Open terminal

  • cd Downloads

  • ./run.sh

  • If permission denied, then

    • chmod +x run.sh
    • ./run.sh
  • If error in configure,

    image
    • cd
    • cd riscv_toolchain/iverilog
    • ./configure
    • make
    • sudo make install
  • To check if riscv-gcc compiler is in the path,

    • gedit ~/.bashrc

    • insert this in the bash file if not present:

      export PATH=~/riscv_toolchain/riscv64-unknown-elf-gcc-8.3.0-2019.08.0-x86_64-linux-ubuntu14/bin:$PATH

    End of installation

Yosys with GTKwave Installation
  • cd

  • git clone https://github.com/YosysHQ/yosys.git

  • cd yosys

  • sudo apt install make

  • sudo apt-get update

  • sudo apt-get install build-essential clang bison flex libreadline-dev gawk tcl-dev libffi-dev git graphviz xdot pkg-config python3 libboost-system-dev libboost-python-dev libboost-filesystem-dev zlib1g-dev

  • make config-gcc

  • make

  • sudo make install

  • sudo apt install gtkwave

  • Type yosys

    image

    If received as shown above, installation is successful.

    End of installation

TABLE OF CONTENTS

DAY 1

Introduction to RISCV ISA and GNU Compiler Toolchain

DAY 2

Introduction to ABI and Basic Verification Flow

  • Application Binary Interface
    • Introduction to ABI
    • Memory Allocation for Double Words
    • Load, Add and Store Instructions
    • 32-Registers and their ABI Names
  • Labwork using ABI Function Calls
    • Algorithm for C Program using ASM
    • Review ASM Function Calls
    • Simulate C Program using Function Call
    • Lab to Run C-Program On RISCV-CPU

DAY 3

Introduction to Verilog RTL design and Synthesis

DAY 4

Timing Libs, Hierarchical vs Flat Synthesis and Efficient Flop Coding Styles

DAY 5

Combinational and Sequential Optimizations

DAY 6

GLS, Blocking vs Non-Blocking and Synthesis-Simulation Mismatch

Day-1

Introduction to Basic Keywords

Introduction
  • ISA (Instruction Set Architecture)

    • ISA defines the interface between a computer's hardware and its software, specifically how the processor and its components interact with the software instructions that drive the execution of tasks.
    • It encompasses a set of instructions, addressing modes, data types, registers, memory organization, and the mechanisms for executing and managing instructions.
  • RISC-V (Reduced Instruction Set Computing - Five).

    • It is an open-source Instruction Set Architecture (ISA) that has gained significant attention and adoption in the world of computer architecture and semiconductor design.
    • RISC architectures keep the instruction set small and regular, so that most instructions can complete in a single clock cycle on a simple pipelined implementation. This usually makes individual instructions cheaper to decode and execute.
image
From Apps to Hardware

  1. Apps: Application software, often referred to simply as "applications" or "apps," is a type of computer software that is designed to perform specific tasks or functions for end-users.

  2. System software: System software refers to a category of computer software that acts as an intermediary between the hardware components of a computer system and the user-facing application software. It provides essential services, manages hardware resources, and enables the execution of application programs. System software plays a critical role in maintaining the overall functionality, security, and performance of a computer system.

  3. Operating System: The operating system is a fundamental piece of software that manages hardware resources and provides various services for both users and application programs. It controls tasks such as memory management, process scheduling, file system management, and user interface interaction. Examples of operating systems include Microsoft Windows, macOS, Linux, and Android.

  4. Compiler: A compiler is a type of software tool that translates high-level programming code written by developers into assembly-level language.

  5. Assembler: An assembler is a software tool that translates assembly language code into machine code or binary code that can be directly executed by a computer's processor.

  6. RTL: RTL serves as an abstraction level in the design process that represents the behavior of a digital circuit in terms of registers and the operations that transfer data between them.

  7. Hardware: Hardware refers to the physical components of a computer system or any electronic device. It encompasses all the tangible parts that make up a computing or electronic device and enable it to perform various tasks.

Detail Description of Course Content

Pseudo Instructions: Pseudo-instructions are used to simplify programming, improve code readability, and reduce the number of explicit instructions a programmer needs to write. They are especially useful for common programming patterns that involve multiple instructions. Ex: li, mv.

Base Integer Instructions: The term "base integer instructions" refers to the fundamental set of instructions that form the foundation for performing basic arithmetic, logical, and data movement operations. Ex: add, sub, and, or, xor, sll.

Multiply Extension Instructions: The RISC-V "M" standard extension adds integer multiplication and division instructions to the base integer instruction set. Ex: mul, mulh, mulhu, mulhsu.

Single and Double Precision Floating Point Extension: The RISC-V architecture includes floating-point extensions that provide support for both single-precision (32-bit) and double-precision (64-bit) floating-point arithmetic operations. These extensions are often referred to as the "F" and "D" extensions, respectively. Floating-point arithmetic is essential for handling real numbers with fractional parts, which integer instructions cannot represent directly.

Application Binary Interface: ABI stands for "Application Binary Interface." It is a set of rules and conventions that govern how software components interact with each other at the binary level. The ABI defines various aspects of program execution, including how function calls are made, how parameters are passed and returned, how memory is allocated and managed, and more.

Memory Allocation and Stack Pointer

  • Memory allocation refers to the process of assigning and managing memory segments for various data structures, variables, and objects used by a program. It involves allocating memory space from the system's memory pool and releasing it when it is no longer needed to prevent memory leaks.
  • The stack pointer is a register that holds the address of the current top of the call stack. It moves as functions are called and return, so that each call gets its own space for local variables and saved registers.

Labwork for RISCV Toolchain

C Program
  • We wrote a C program for calculating the sum from 1 to n using a text editor, leafpad.

leafpad sum1ton.c

#include <stdio.h>

int main() {
	int i, sum = 0, n = 111;

	/* Add every integer from 1 to n to the running total. */
	for (i = 1; i <= n; ++i) {
		sum += i;
	}
	printf("Sum of numbers from 1 to %d is %d \n", n, sum);
	return 0;
}

Screenshot from 2023-08-20 19-35-18

  • Using the gcc compiler, we compiled the program and ran it to get the output.

gcc sum1ton.c
./a.out

Screenshot from 2023-08-20 19-35-33

RISCV GCC Compiler and Disassemble

  • Using the riscv gcc compiler, we compiled the C program.

riscv64-unknown-elf-gcc -O1 -mabi=lp64 -march=rv64i -o sum1ton.o sum1ton.c

  • Using ls -ltr sum1ton.o, we can check that the output file is created.

  • To get the disassembled assembly code for the C program,

riscv64-unknown-elf-objdump -d sum1ton.o | less

  • In order to view the main section, type /main.

  • Here, since we used -O1 optimisation, the number of instructions is 15.

image
  • When we use -Ofast optimisation, we can see that the number of instructions has been reduced to 12.
image
  • -Onumber : level of optimisation required
  • -mabi : specifies the ABI (Application Binary Interface) to be used during code generation according to the requirements
  • -march : specifies target architecture
  • In order to view the different options available for these fields, use the following commands

Go to the directory where riscv64-unknown-elf-gcc is present.

  • -O1 : riscv64-unknown-elf-gcc --help=optimizers
  • -mabi : riscv64-unknown-elf-gcc --target-help
  • -march : riscv64-unknown-elf-gcc --target-help
  • To quit:
    • use esc :q to quit
Spike Simulation and Debug
  • spike pk sum1ton.o is used to check whether the instructions produced are right to give the correct output.

Screenshot from 2023-08-20 20-00-48

  • spike -d pk sum1ton.o is used for debugging.

  • The contents of the registers can also be viewed.

Screenshot from 2023-08-20 20-10-09

  • press ENTER : to show the first line and successive ENTER to show successive lines
  • reg 0 a2 : to check the content of register a2 on core 0
  • q : to quit the debug process

Integer Number Representation

Unsigned Numbers
  • Unsigned numbers, also known as non-negative numbers, are numerical values that represent magnitudes without indicating direction or sign.
  • Range: [0, (2^n)-1 ]
Signed Numbers
  • Signed numbers are numerical values that can represent both positive and negative magnitudes, along with zero.
  • Range (two's complement):
    • Positive : [0 , 2^(n-1)-1]
    • Negative : [-2^(n-1) , -1]
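
The ranges above can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and the signed functions assume two's-complement representation.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of an n-bit unsigned integer: 2^n - 1 (1 <= n <= 64). */
uint64_t unsigned_max(unsigned n) {
    assert(n >= 1 && n <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so n == 64 is handled separately. */
    return (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Largest value of an n-bit two's-complement integer: 2^(n-1) - 1. */
int64_t signed_max(unsigned n) {
    return (int64_t)(unsigned_max(n) >> 1);
}

/* Smallest value of an n-bit two's-complement integer: -2^(n-1). */
int64_t signed_min(unsigned n) {
    return -signed_max(n) - 1;
}
```

For n = 8 these give 255, 127 and -128, which matches the ranges [0, 2^n - 1] and [-2^(n-1), 2^(n-1) - 1].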
Labwork
  • We wrote a C program that shows the maximum and minimum values of 64-bit unsigned numbers.
#include <stdio.h>
#include <limits.h>

int main(){
	/* pow(2,64) is a double and cannot be converted to an integer exactly, so the limits come from <limits.h>. */
	unsigned long long int max = ULLONG_MAX; /* 2^64 - 1 when unsigned long long is 64 bits wide */
	unsigned long long int min = 0;          /* unsigned values cannot go below 0 */
	printf("lowest number represented by unsigned 64-bit integer is %llu\n",min);
	printf("highest number represented by unsigned 64-bit integer is %llu\n",max);
	return 0;
}

Screenshot from 2023-08-20 20-18-42

  • We wrote a C program that shows the maximum and minimum values of 64-bit signed numbers.
#include <stdio.h>
#include <limits.h>

int main(){
	/* pow(2,63)-1 rounds to 2^63 as a double, which does not fit in long long, so the limits come from <limits.h>. */
	long long int max = LLONG_MAX; /* 2^63 - 1 when long long is 64 bits wide */
	long long int min = LLONG_MIN; /* -2^63 when long long is 64 bits wide */
	printf("lowest number represented by signed 64-bit integer is %lld\n",min);
	printf("highest number represented by signed 64-bit integer is %lld\n",max);
	return 0;
}

Screenshot from 2023-08-20 20-21-33

Day-2

Application Binary Interface

Introduction to ABI
  • An Application Binary Interface (ABI) is a set of rules and conventions that dictate how binary code interacts with other binary code at the level of machine code or compiled code. In simpler terms, it defines the interface between two software components on a given hardware architecture, even when they are written in different programming languages or compiled by different compilers.
  • The ABI is crucial for enabling interoperability between different software components, such as libraries, object files, or even entire programs. It allows independently compiled components to work together by adhering to a common set of rules for function calls, register usage, and data representation.
Memory Allocation for Double Words

A 64-bit number (or any multi-byte value) can be stored in memory in little-endian or big-endian order. The two differ only in the order in which the bytes are placed at increasing memory addresses.

  1. Little-Endian: In little-endian representation, you store the least significant byte (LSB) at the lowest memory address and the most significant byte (MSB) at the highest memory address.
  2. Big-Endian: In big-endian representation, you store the most significant byte (MSB) at the lowest memory address and the least significant byte (LSB) at the highest memory address.

For example, consider the 64-bit hexadecimal value 0x0123456789ABCDEF.

In Little-Endian representation, it would be stored as follows in memory:

image

In Big-Endian representation, it would be stored as follows in memory:

image
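
The two byte orders can also be written out in C. This is a small sketch with hypothetical helper names (store_le64, store_be64), in which an 8-byte array stands in for eight consecutive memory addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian: the least significant byte lands at the lowest address (mem[0]). */
void store_le64(uint8_t mem[8], uint64_t v) {
    for (size_t i = 0; i < 8; ++i)
        mem[i] = (uint8_t)(v >> (8 * i));
}

/* Big-endian: the most significant byte lands at the lowest address (mem[0]). */
void store_be64(uint8_t mem[8], uint64_t v) {
    for (size_t i = 0; i < 8; ++i)
        mem[i] = (uint8_t)(v >> (8 * (7 - i)));
}
```

For 0x0123456789ABCDEF, store_le64 fills mem[0] to mem[7] with EF CD AB 89 67 45 23 01, while store_be64 fills them with 01 23 45 67 89 AB CD EF.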
Load, Add and Store Instructions

Load, Add, and Store instructions are fundamental operations in computer architecture and assembly programming. They are often used to manipulate data within a computer's memory and registers.

  1. Load Instructions: Load instructions are used to transfer data from memory to registers. They allow you to fetch data from a specified memory address and place it into a register for further processing.

Example ld x6, 8(x5)

In this Example

  • ld is the load double-word instruction.
  • x6 is the destination register.
  • 8(x5) is the memory address: the base address held in register x5 plus the offset 8.
  2. Store Instructions: Store instructions are used to write data from registers into memory. They store values from registers at the specified memory addresses.

Example sd x8, 8(x9)

In this Example

  • sd is the store double-word instruction.
  • x8 is the source register.
  • 8(x9) is the memory address: the base address held in register x9 plus the offset 8.
  3. Add Instructions: Add instructions are used to perform addition operations on registers. They add the values of two source registers and store the result in a destination register.

Example add x9, x10, x11

In this Example

  • add is the add instruction.
  • x9 is the destination register.
  • x10 and x11 are the source registers.
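
To make the three instructions concrete, here is a toy C model of a register file and a small memory. It is a sketch for illustration only: the Machine type, the 64-byte memory size and the function signatures are assumptions, and memcpy uses the byte order of the host machine rather than modelling RISC-V byte order explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 64

typedef struct {
    uint64_t x[32];          /* integer registers x0..x31 */
    uint8_t  mem[MEM_BYTES]; /* tiny byte-addressable memory */
} Machine;

/* ld rd, offset(rs1): load 8 bytes from address x[rs1] + offset into x[rd]. */
void ld(Machine *m, int rd, int64_t offset, int rs1) {
    uint64_t addr = m->x[rs1] + (uint64_t)offset;
    assert(addr <= MEM_BYTES - 8);
    memcpy(&m->x[rd], &m->mem[addr], 8);
    m->x[0] = 0; /* x0 is hard-wired to zero */
}

/* sd rs2, offset(rs1): store the 8 bytes of x[rs2] at address x[rs1] + offset. */
void sd(Machine *m, int rs2, int64_t offset, int rs1) {
    uint64_t addr = m->x[rs1] + (uint64_t)offset;
    assert(addr <= MEM_BYTES - 8);
    memcpy(&m->mem[addr], &m->x[rs2], 8);
}

/* add rd, rs1, rs2: x[rd] = x[rs1] + x[rs2]. */
void add(Machine *m, int rd, int rs1, int rs2) {
    m->x[rd] = m->x[rs1] + m->x[rs2];
    m->x[0] = 0; /* writes to x0 are discarded */
}
```

With this model, ld(&m, 6, 8, 5) corresponds to ld x6, 8(x5), sd(&m, 8, 8, 9) to sd x8, 8(x9), and add(&m, 9, 10, 11) to add x9, x10, x11.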
32-Registers and their ABI Names

The choice of the number of registers in a processor's architecture, such as the RISC-V RV64 architecture with its 32 general-purpose registers, involves a trade-off between various factors. A processor could have more registers, but every register operand in an instruction needs enough bits to name it (5 bits for 32 registers). More registers would therefore need wider register fields, leading to larger instructions that take up more memory and potentially slow down instruction fetch and decode.

ABI Names

ABI names for registers serve as a standardized way to designate the purpose and usage of specific registers within a software ecosystem. These names play a critical role in maintaining compatibility, optimizing code generation, and facilitating communication between different software components.

image

Labwork using ABI Function Calls

Algorithm for C Program using ASM
  • Incorporating assembly language code into a C program can be done using inline assembly or by linking separate assembly files with your C code.
  • When you call an assembly function from your C code, the C calling convention is followed, including pushing arguments onto the stack or passing them in registers as required.
  • The program executes the assembly function, following the assembly instructions you've provided.
Review ASM Function Calls
  • We wrote the C code in one file and the assembly code in a separate file.
  • In the assembly file, we declared assembly functions with appropriate signatures that match the calling conventions of the platform.

C Program custom1to9.c

#include <stdio.h>

extern int load(int x, int y);

int main()
{
  int result = 0;
  int count = 9;
  result = load(0x0, count+1);
  printf("Sum of numbers from 1 to 9 is %d\n", result);
}

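Assembly File load.s

.section .text
.global load
.type load, @function

load:

add a4, a0, zero    # a4 = running sum, initialised from the first argument (0)
add a2, a0, a1      # a2 = loop limit = first argument + second argument (count+1)
add a3, a0, zero    # a3 = loop counter i, starting from the first argument (0)

loop:

add a4, a3, a4      # sum = sum + i
addi a3, a3, 1      # i = i + 1
blt a3, a2, loop    # repeat while i < limit
add a0, a4, zero    # place the result in a0, the return value register
ret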
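
As a cross-check, the same loop can be written in plain C. This is only a sketch: load_model is a hypothetical name, and it uses int arithmetic, which matches the assembly for the small values used in this lab.

```c
/* C model of the assembly function load(x, y) above. */
int load_model(int x, int y) {
    int sum = x;        /* a4: running sum */
    int limit = x + y;  /* a2: loop limit */
    int i = x;          /* a3: loop counter */

    /* The assembly tests the condition after the body, so the body runs at least once. */
    do {
        sum += i;
        i += 1;
    } while (i < limit);

    return sum;         /* a0: return value */
}
```

load_model(0, 10) adds 0 through 9 and returns 45, the value that custom1to9.c prints as the sum from 1 to 9.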
Simulate C Program using Function Call
  • Compilation: To compile the C code and the assembly file, use the command

riscv64-unknown-elf-gcc -O1 -mabi=lp64 -march=rv64i -o custom1to9.o custom1to9.c load.s

This generates the output file custom1to9.o.

  • Execution: To execute the object file run the command

spike pk custom1to9.o

Screenshot from 2023-08-20 20-29-14

Lab to Run C-Program on RISCV-CPU
  • git clone https://github.com/kunalg123/riscv_workshop_collaterals.git

  • cd riscv_workshop_collaterals

  • ls -ltr

  • cd labs

  • ls -ltr

  • chmod 777 rv32im.sh

  • ./rv32im.sh

Day-3

Introduction to Open-Source Simulator iVerilog

Introduction to iVerilog Design Testbench
  • Simulator

    • It is a tool used for simulating the design. It looks for the changes on the input signals to evaluate the outputs.
    • If there is no change in the inputs, the simulator doesn't evaluate the outputs.
    • RTL is checked for adherence to the spec by simulating the design.
    • The tool used here is iverilog .
  • iVerilog

    • It is an open-source Verilog simulator used for testing and simulating digital circuit designs described in the Verilog hardware description language (HDL).
    • Both the design and the testbench are fed to the simulator and it produces a vcd (value change dump) file.
    • In order to view the vcd file, we use GTKWave, which displays the waveforms.
    image
  • Design

    • It is the actual Verilog code, or set of Verilog codes, which has the intended functionality to meet the required specifications.
    • Verilog is used to describe the behavior and structure of digital circuits at different levels of abstraction, from high-level system descriptions down to low-level gate-level representations.
  • Testbench

    • A testbench is a specialized Verilog module or program used to verify the functionality and behavior of another Verilog module, circuit, or design. Testbenches are essential for testing and simulating digital designs before they are synthesized or manufactured as physical chips.

    • It is a setup to apply stimulus to the design to check its functionality.

      image

Labs using iVerilog and GTKwave

Introduction to Lab
  • Make a directory named vsd: mkdir vsd

  • cd vsd

  • git clone https://github.com/kunalg123/sky130RTLDesignAndSynthesisWorkshop.git

  • This creates a folder called sky130RTLDesignAndSynthesisWorkshop in the vsd directory.

    • my_lib : contains all the library files

    • lib : contains sky130 standard cell library used for our synthesis

    • verilog_model : contains the Verilog models of all the standard cells contained in the .lib

    • verilog_files : contains all the verilog source files and testbench files which are required for labs

iVerilog GTKwave Part-1
  • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files

  • We load the source code along with the testbench code into the iverilog simulator.

  • iverilog good_mux.v tb_good_mux.v

  • We can see that an output file a.out has been created.

  • ./a.out

  • Running a.out creates a vcd file, which is loaded into the waveform viewer gtkwave.

  • gtkwave tb_good_mux.vcd

Screenshot from 2023-08-27 15-49-48
Screenshot from 2023-08-27 15-50-27

iVerilog GTKwave Part-2
  • In order to view the contents in the files,

  • gvim tb_good_mux.v -o good_mux.v

image

Introduction to Yosys and Logic Synthesis

Introduction to Yosys
  • Synthesizer

    • It is a tool used for converting RTL design code to netlist.
    • Here, the synthesizer used is Yosys.
  • Yosys

    • It is an open-source framework for Verilog RTL synthesis and formal verification.
    • Yosys provides a collection of tools and algorithms that enable designers to transform high-level RTL (Register Transfer Level) descriptions of digital circuits into optimized gate-level representations suitable for physical implementation on hardware.
image
  • Design and .lib files are fed to the synthesizer to get a netlist file.
  • Netlist is the representation of the design in the form of standard cells in the .lib
  • Commands used to perform different operations:

    • read_verilog to read the design
    • read_liberty to read the .lib file
    • write_verilog to write out the netlist file
  • To verify the synthesis

image
  • The netlist along with the testbench is fed to the iverilog simulator.
  • The vcd file generated is fed to the gtkwave waveform viewer.
  • The waveforms must be the same as those observed during RTL simulation.
  • The same RTL testbench can be used, as the primary inputs and primary outputs remain the same between the RTL design and the synthesised netlist.
Introduction to Logic Synthesis
  • Logic Synthesis

    • Logic synthesis is a process in digital design that transforms a high-level hardware description of a digital circuit, typically in a hardware description language (HDL) like Verilog or VHDL, into a lower-level representation composed of logic gates and flip-flops.
    • The goal of logic synthesis is to optimize the design for various criteria such as performance, area, power consumption, and timing.
  • .lib

    • It is a collection of logical modules like And, Or, Not etc.
    • It has different flavors of the same gate, like 2-input AND gate, 3-input AND gate etc., with different speeds.
  • Why different flavors of gate?

    • In order to make a circuit faster, the clock frequency should be high.
    • For that, the time period of the clock should be as low as possible.
image
  • In a sequential circuit, clock period depends on:
    • Clock to Q of flip-flop A.
    • Propagation delay of combinational circuit.
    • Setup time of flip-flop B.
image
  • Why need fast and slow cells?

    • To ensure that there are no HOLD issues at flip-flop B, we require slow cells.
    • For a smaller propagation time, we need faster cells.
    • The collection forms the .lib
  • Faster Cells vs Slower Cells

    • The load in a digital circuit is capacitance.
    • The faster the charging or discharging of the capacitance, the lower the cell delay.
    • However, for a quick charge/discharge of the capacitor, we need transistors capable of sourcing more current, i.e., we need wide transistors.
    • Wider transistors have lesser delay but consume more area and power.
    • Narrow transistors have more delay but consume less area and power.
    • Faster cells come with a cost of area and power.
  • Selection of the Cells

    • We have to guide the Synthesizer to choose the flavour of cells that is optimum for implementation of logic circuit.
    • Overuse of faster cells leads to a circuit that is poor in terms of power and area, and may also cause hold time violations.
    • Overuse of slower cells leads to sluggish circuits that may not meet the performance needs.
    • Hence the guidance is offered to the synthesiser in the form of constraints.
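
The timing relationships described in this section can be put into numbers with a short C sketch. The function names and the delay values in the usage note are illustrative assumptions, not figures from the sky130 library, and clock skew and jitter are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Setup: smallest clock period (ns) for the path flop A -> combinational logic -> flop B. */
double min_clock_period_ns(double t_clk_to_q, double t_comb, double t_setup) {
    assert(t_clk_to_q >= 0.0 && t_comb >= 0.0 && t_setup >= 0.0);
    return t_clk_to_q + t_comb + t_setup;
}

/* Highest clock frequency (MHz) that still meets that period. */
double max_clock_freq_mhz(double t_clk_to_q, double t_comb, double t_setup) {
    double period = min_clock_period_ns(t_clk_to_q, t_comb, t_setup);
    assert(period > 0.0);
    return 1000.0 / period;
}

/* Hold: data launched by flop A must not reach flop B before B's hold time has passed. */
bool hold_is_met(double t_clk_to_q, double t_comb_min, double t_hold) {
    return t_clk_to_q + t_comb_min >= t_hold;
}
```

With a clock-to-Q delay of 0.5 ns, a combinational delay of 1.0 ns and a setup time of 0.5 ns, the minimum period is 2.0 ns, i.e. a maximum frequency of 500 MHz. Faster cells reduce t_comb and raise that frequency, while hold_is_met shows why a path that is too fast may need slower cells.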

Labs using Yosys and Sky130 PDKs

Yosys good_mux
  • To invoke yosys
    • cd
    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • Type yosys

Screenshot from 2023-08-28 11-29-20

  • To read the library

    read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • To read the design

    read_verilog good_mux.v

  • To synthesize the module

    synth -top good_mux

Screenshot from 2023-08-28 11-37-05

  • To generate the netlist

abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

Screenshot from 2023-08-28 11-39-10

It gives a report of what cells are used and the number of input and output signals.

  • To see the logic realised

    show

    Screenshot from 2023-08-28 11-41-30

    The mux is completely realised in the form of sky130 library cells.

  • To write the netlist

    • write_verilog good_mux_netlist.v

    • !gvim good_mux_netlist.v

    • To view a simplified code

      write_verilog -noattr good_mux_netlist.v

      !gvim good_mux_netlist.v

Screenshot from 2023-08-28 11-45-47 Screenshot from 2023-08-28 11-46-21

Day-4

Introduction to Timing Dot Libs

Introduction to Dot Lib
  • To view the contents in the .lib

    gvim ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

    image
    • The first line in the file library ("sky130_fd_sc_hd__tt_025C_1v80") :

      • tt : indicates the process corner; here it is the Typical process.
      • 025C : indicates the temperature at which the cells are characterised (25 °C).
      • 1v80 : indicates the supply voltage at which the cells are characterised (1.80 V).
  • It also displays the units of various parameters.

    image image
  • It gives the features of the cells

  • To enable line number :se nu

  • To view all the cells :g//

  • To view any instance :/instance

  • Since there are 5 inputs, for all the 32 possible combinations, it gives the delay, power and all the other parameters for each cell.

  • The image below shows the power consumption and area comparison.

Screenshot from 2023-08-28 14-40-20

Hierarchical vs Flat Synthesis

Hierarchical Synthesis Flat Synthesis

Hierarchical Synthesis

Hierarchical synthesis is an approach in digital design and logic synthesis where complex designs are broken down into smaller, more manageable modules or sub-circuits, and each module is synthesized individually. These synthesized modules are then integrated back into the overall design hierarchy. This approach helps manage the complexity of large designs and allows designers to work on different parts of the design independently.

  • The file we used in this lab is multiple_modules.v

    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • gvim multiple_modules.v

Screenshot from 2023-08-28 14-42-30

  • multiple_modules instantiates sub_module1 and sub_module2

  • Launch yosys

  • Read the library file: read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • Read the verilog file: read_verilog multiple_modules.v

  • Run synth -top multiple_modules to synthesize with multiple_modules as the top module

image

image
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • To view the netlist show multiple_modules

    image
  • Here it shows sub_module1 and sub_module2 instead of AND gate and OR gate.
  • write_verilog -noattr multiple_modules_hier.v
  • !gvim multiple_modules_hier.v
image image

Flattened Synthesis

Flattened synthesis is the opposite of hierarchical synthesis. Instead of maintaining the hierarchical structure of the design during synthesis, flattened synthesis combines all modules and sub-modules into a single, flat representation. This means that the entire design is synthesized as a single unit, without preserving the modular organization present in the original high-level description.

  • Launch yosys
  • Read the library file: read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • Read the verilog file: read_verilog multiple_modules.v
  • Run synth -top multiple_modules to synthesize with multiple_modules as the top module
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • Run flatten to flatten the hierarchy before writing out the netlist
  • show
image
  • write_verilog -noattr multiple_modules_flat.v
  • !gvim multiple_modules_flat.v
image image

Various Flop Coding Styles and Optimization

Why Flops and Flop Coding Styles

Why do we need a Flop?

  • A flip-flop (often abbreviated as "flop") is a fundamental building block in digital circuit design.
  • It's a type of sequential logic element that stores binary information (0 or 1) and can change its output based on clock signals and input values.
  • In a combinational circuit, the output changes after the propagation delay of the circuit once inputs are changed.
  • During the propagation of data, if there are different paths with different propagation delays, then a glitch might occur.
  • There will be multiple glitches for multiple combinational circuits.
  • Hence, we need flops to store the data from the combinational circuits.
  • When a flop is used, the output of combinational circuit is stored in it and it is propagated only at the posedge or negedge of the clock so that the next combinational circuit gets a glitch free input thereby stabilising the output.
  • We use control pins like set and reset to initialise the flops.
  • They can be synchronous and asynchronous.

D Flip-Flop with Asynchronous Reset

  • When the reset is high, the output of the flip-flop is forced to 0, irrespective of the clock signal.
  • Else, on the positive edge of the clock, the output is updated with the value at the D input.

gvim dff_asyncres.v

image

D Flip-Flop with Asynchronous Set

  • When the set is high, the output of the flip-flop is forced to 1, irrespective of the clock signal.
  • Else, on the positive edge of the clock, the output is updated with the value at the D input.

gvim dff_async_set.v

image

D Flip-Flop with Synchronous Reset

  • When the reset is high on the positive edge of the clock, the output of the flip-flop is forced to 0.

  • Else, on the positive edge of the clock, the output is updated with the value at the D input.

    gvim dff_syncres.v

image

D Flip-Flop with Asynchronous Reset and Synchronous Reset

  • When the asynchronous reset is high, the output is forced to 0, irrespective of the clock signal.
  • When the synchronous reset is high at the positive edge of the clock, the output is forced to 0.
  • Else, on the positive edge of the clock, the output is updated with the value at the D input.
  • Here, it is a combination of both synchronous and asynchronous reset DFF.

gvim dff_asyncres_syncres.v

image
Lab Flop Synthesis Simulations

D Flip-Flop with Asynchronous Reset

  • Simulation

    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • iverilog dff_asyncres.v tb_dff_asyncres.v
    • ./a.out
    • gtkwave tb_dff_asyncres.vcd

    Screenshot from 2023-08-28 15-14-11

  • Synthesis

    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files

    • yosys

    • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

    • read_verilog dff_asyncres.v

    • synth -top dff_asyncres

    • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

    • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

    • show

      image

D Flip-Flop with Asynchronous Set

  • Simulation
    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • iverilog dff_async_set.v tb_dff_async_set.v
    • ./a.out
    • gtkwave tb_dff_async_set.vcd

Screenshot from 2023-08-28 15-20-36

  • Synthesis
    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • yosys
    • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • read_verilog dff_async_set.v
    • synth -top dff_async_set
    • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • show
image

D Flip-Flop with Synchronous Reset

  • Simulation

    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • iverilog dff_syncres.v tb_dff_syncres.v
    • ./a.out
    • gtkwave tb_dff_syncres.vcd
    image
  • Synthesis

    • cd vsd/sky130RTLDesignAndSynthesisWorkshop/verilog_files
    • yosys
    • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • read_verilog dff_syncres.v
    • synth -top dff_syncres
    • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
    • show
image
Interesting Optimisations
  • gvim mult_2.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog mult_2.v
  • synth -top mul2
image
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image
  • write_verilog -noattr mul2_netlist.v
  • !gvim mul2_netlist.v
image
  • gvim mult_8.v

    image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • read_verilog mult_8.v

  • synth -top mult8

image
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image
  • write_verilog -noattr mult8_netlist.v
  • !gvim mult8_netlist.v
image
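Multiplication by a constant often needs no multiplier hardware at all. The sketch below assumes the two source files multiply a 3-bit input by 2 and by 9 respectively (the files appear only as screenshots here, so the constants, widths, and module names are assumptions). It shows why the synthesized netlists can reduce to plain wiring.

```verilog
// Illustrative sketch; module names, widths, and constants are assumptions.

// Multiply by 2: a 3-bit input shifted left by one position.
module mul2_sketch (input wire [2:0] a, output wire [3:0] y);
    assign y = {a, 1'b0};   // same value as a * 2
endmodule

// Multiply by 9: a*9 = a*8 + a = (a << 3) + a.
// For a 3-bit a the two terms occupy different bits,
// so the sum is simply a concatenated with itself.
module mul9_sketch (input wire [2:0] a, output wire [5:0] y);
    assign y = {a, a};      // same value as a * 9
endmodule
```

Both results are pure bit rearrangements, so no standard cells are needed to implement them.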

Day 5

Introduction to Optimisations

Combinational Optimisation
  • Combinational logic refers to logic circuits where the outputs depend only on the current inputs and not on any previous states.
  • Combinational logic optimisation is the synthesis step that simplifies this logic, for example by folding constants and removing redundant gates, without changing the function the circuit computes.
  • The goal is to squeeze the logic down to the smallest equivalent implementation, so that the final circuit is area- and power-efficient.
  • Techniques for Optimisations:
    • Constant propagation is an optimization technique used in compiler design and digital circuit synthesis to improve the efficiency of code and circuit implementations by replacing variables or expressions with their constant values where applicable.
    • Boolean logic optimization, also known as logic minimization or Boolean function simplification, is a process in digital design that aims to simplify Boolean expressions or logic circuits by reducing the number of terms, literals, and gates required to implement a given logical function.
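The two techniques above can be sketched on small examples. The module names and expressions below are hypothetical and chosen only to illustrate the idea.

```verilog
// Illustrative sketch; all module names and expressions are assumptions.

// Constant propagation: y = ~((a & b) | c) with a tied to 0.
module const_prop_before (input wire b, input wire c, output wire y);
    wire a = 1'b0;               // constant input
    assign y = ~((a & b) | c);   // a & b is always 0, so y = ~c
endmodule

module const_prop_after (input wire c, output wire y);
    assign y = ~c;               // a single inverter remains
endmodule

// Boolean simplification: (a & b) | (a & ~b) = a & (b | ~b) = a.
module bool_opt_before (input wire a, input wire b, output wire y);
    assign y = (a & b) | (a & ~b);
endmodule

module bool_opt_after (input wire a, output wire y);
    assign y = a;                // b drops out entirely
endmodule
```

In each pair the "after" module computes the same output with fewer gates, which is the kind of result a synthesis tool aims for.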
Sequential Logic Optimisations
  • Sequential logic optimizations involve improving the efficiency, performance, and resource utilization of digital circuits that include memory elements like flip-flops and latches.
  • Optimizing sequential logic is crucial in ensuring that digital circuits meet timing requirements, consume minimal power, and occupy the least possible area while maintaining correct functionality.
  • Optimisation methods:
    • Sequential constant propagation, also known as constant propagation across sequential elements, is an optimization technique used in digital design to identify and propagate constant values through sequential logic elements like flip-flops and registers. This technique aims to replace variable values with their known constant values at various stages of the logic circuit, optimizing the design for better performance and resource utilization.
    • State optimization, also known as state minimization or state reduction, is an optimization technique used in digital design to reduce the number of states in finite state machines (FSMs) while preserving the original functionality.
    • Sequential logic cloning, also known as register cloning or register duplication, is a technique used in physical-aware synthesis. When one flip-flop drives loads that are placed far apart on the chip, the flip-flop is duplicated so that each copy sits close to the loads it drives. This shortens long routing paths and reduces their delay at the cost of extra area, and it is worthwhile only when the path feeding the original flip-flop has enough positive slack to absorb the added load.
    • Retiming is an optimization technique used in digital design to improve the performance of a circuit by repositioning registers (flip-flops) along its paths to balance the timing and reduce the critical path delay. The primary goal of retiming is to achieve a shorter clock period without changing the functionality of the circuit.
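Sequential constant propagation is the easiest of these methods to show in code. The sketch below is a hypothetical example: a flip-flop whose D input is tied to 0 and whose reset also drives it to 0 can never hold a 1, so the logic it feeds collapses to a constant.

```verilog
// Illustrative sketch; module and signal names are assumptions.
module seq_const_sketch (
    input  wire clk,
    input  wire reset,
    input  wire a,
    output wire y
);
    reg q;

    always @(posedge clk or posedge reset) begin
        if (reset)
            q <= 1'b0;   // reset value is 0
        else
            q <= 1'b0;   // D pin tied to 0
    end

    // q is 0 in every reachable state, so the NAND output is always 1.
    assign y = ~(a & q);
endmodule
```

A synthesis tool may therefore drop the flip-flop and the NAND gate and tie y high. The same would not hold if the asynchronous pin set q to 1, because q would then change from 1 to 0 on the first clock edge.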

Combinational Logic Optimisations

opt_check
  • gvim opt_check.v

    image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • read_verilog opt_check.v

  • synth -top opt_check

  • opt_clean -purge

  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • show

    image image
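A plausible reading of this result, assuming the design is a 2:1 mux with one data leg tied to 0 (the source is shown only as a screenshot, so the expression and names below are assumptions), is that the mux collapses into a single AND gate:

```verilog
// Illustrative sketch; the expression is an assumption about the lab file.
module mux_const0_sketch (input wire a, input wire b, output wire y);
    assign y = a ? b : 1'b0;   // mux with a constant-0 leg
endmodule

// Equivalent simplified form: y is 1 only when both a and b are 1.
module mux_const0_simplified (input wire a, input wire b, output wire y);
    assign y = a & b;
endmodule
```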
opt_check2
  • gvim opt_check2.v

    image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • read_verilog opt_check2.v

  • synth -top opt_check2

  • opt_clean -purge

  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib

  • show

image image
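The complementary case, again sketched under the assumption that one mux leg is tied to 1 (expression and names are hypothetical), reduces to an OR gate:

```verilog
// Illustrative sketch; the expression is an assumption about the lab file.
module mux_const1_sketch (input wire a, input wire b, output wire y);
    assign y = a ? 1'b1 : b;   // mux with a constant-1 leg
endmodule

// Equivalent simplified form: y is 1 when a is 1, otherwise it follows b.
module mux_const1_simplified (input wire a, input wire b, output wire y);
    assign y = a | b;
endmodule
```

Standard-cell libraries often implement OR as NAND or NOR with inverted inputs, so the mapped netlist may show an inverter-plus-NAND structure instead of a literal OR cell.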
opt_check3
  • gvim opt_check3.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog opt_check3.v
  • synth -top opt_check3
  • opt_clean -purge
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
opt_check4
  • gvim opt_check4.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog opt_check4.v
  • synth -top opt_check4
  • opt_clean -purge
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
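Nested ternaries can hide a much simpler function. The sketch below uses a hypothetical expression of this kind together with a small exhaustive testbench that compares it against an XNOR of a and c. When a is 1 both inner branches evaluate to c, and when a is 0 the result is !c, so b has no effect.

```verilog
// Illustrative sketch; the expression and names are assumptions.
module nested_mux_sketch (
    input  wire a,
    input  wire b,
    input  wire c,
    output wire y
);
    assign y = a ? (b ? (a & c) : c) : (!c);
endmodule

// Exhaustive check that the expression equals a XNOR c.
module nested_mux_tb;
    reg a, b, c;
    wire y;
    integer i;

    nested_mux_sketch dut (.a(a), .b(b), .c(c), .y(y));

    initial begin
        for (i = 0; i < 8; i = i + 1) begin
            {a, b, c} = i[2:0];
            #1;
            if (y !== (a ~^ c))
                $display("MISMATCH a=%b b=%b c=%b y=%b", a, b, c, y);
        end
        $display("check finished");
        $finish;
    end
endmodule
```

Compiling both modules with iverilog and running ./a.out exercises all eight input combinations.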
multiple_module_opt
  • gvim multiple_module_opt.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog multiple_module_opt.v
  • synth -top multiple_module_opt
  • opt_clean -purge
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image

Sequential Logic Optimisations

dff_const1
  • gvim dff_const1.v
image

Simulation

  • iverilog dff_const1.v tb_dff_const1.v
  • ./a.out
  • gtkwave tb_dff_const1.vcd

Screenshot from 2023-08-29 07-46-04

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog dff_const1.v
  • synth -top dff_const1
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
dff_const2
  • gvim dff_const2.v
image

Simulation

  • iverilog dff_const2.v tb_dff_const2.v
  • ./a.out
  • gtkwave tb_dff_const2.vcd

Screenshot from 2023-08-29 07-47-58

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog dff_const2.v
  • synth -top dff_const2
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
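Whether a flip-flop with a constant D input can be removed depends on its reset or set value. The two modules below are a hypothetical sketch of that contrast, with names invented for illustration.

```verilog
// Illustrative sketch; module names are assumptions.

// Resets to 0, then loads 1: q changes after reset is released,
// so the flip-flop carries real state and must be kept.
module dff_reset_then_one (input wire clk, input wire reset, output reg q);
    always @(posedge clk or posedge reset) begin
        if (reset)
            q <= 1'b0;
        else
            q <= 1'b1;
    end
endmodule

// Resets to 1 and loads 1: q is 1 in every reachable state,
// so a synthesis tool may replace the flip-flop with a constant.
module dff_always_one (input wire clk, input wire reset, output reg q);
    always @(posedge clk or posedge reset) begin
        if (reset)
            q <= 1'b1;
        else
            q <= 1'b1;
    end
endmodule
```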
dff_const3
  • gvim dff_const3.v
image

Simulation

  • iverilog dff_const3.v tb_dff_const3.v
  • ./a.out
  • gtkwave tb_dff_const3.vcd

Screenshot from 2023-08-29 07-59-49

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog dff_const3.v
  • synth -top dff_const3
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
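With two flip-flops in a chain, a constant at the first D input does not necessarily make the second output constant. In the hypothetical sketch below (signal names are assumptions), q1 resets to 0 and q resets to 1. On the first clock edge after reset q samples the old q1, so q drops to 0 for one cycle before returning to 1.

```verilog
// Illustrative sketch; module and signal names are assumptions.
module dff_chain_sketch (input wire clk, input wire reset, output reg q);
    reg q1;

    always @(posedge clk or posedge reset) begin
        if (reset) begin
            q  <= 1'b1;   // second stage resets high
            q1 <= 1'b0;   // first stage resets low
        end else begin
            q1 <= 1'b1;   // first stage loads a constant 1
            q  <= q1;     // second stage samples the OLD q1
        end
    end
endmodule
```

Because q takes both values, neither flip-flop can be optimised away.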
dff_const4
  • gvim dff_const4.v
image

Simulation

  • iverilog dff_const4.v tb_dff_const4.v
  • ./a.out
  • gtkwave tb_dff_const4.vcd

Screenshot from 2023-08-29 08-03-49

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog dff_const4.v
  • synth -top dff_const4
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
dff_const5
  • gvim dff_const5.v
image

Simulation

  • iverilog dff_const5.v tb_dff_const5.v
  • ./a.out
  • gtkwave tb_dff_const5.vcd

Screenshot from 2023-08-29 08-04-18

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog dff_const5.v
  • synth -top dff_const5
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image

Sequential Optimisations for Unused Outputs

counter_opt
  • gvim counter_opt.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog counter_opt.v
  • synth -top counter_opt
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
counter_opt2
  • gvim counter_opt2.v
image
  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog counter_opt2.v
  • synth -top counter_opt2
  • dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • show
image image
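The effect of unused outputs can be sketched with two hypothetical 3-bit counters (module names and the decode value are assumptions). In the first, only bit 0 reaches an output, so the upper two flip-flops drive nothing observable and can be removed. In the second, the output decodes all three bits, so every flip-flop is needed.

```verilog
// Illustrative sketch; module names and the decode value are assumptions.

// Only count[0] is observable: count[2:1] can be removed,
// leaving a single flip-flop that toggles every cycle.
module counter_lsb_only (input wire clk, input wire reset, output wire q);
    reg [2:0] count;
    assign q = count[0];

    always @(posedge clk or posedge reset) begin
        if (reset)
            count <= 3'b000;
        else
            count <= count + 1'b1;
    end
endmodule

// All three bits feed the output decode, so all three flip-flops stay.
module counter_all_bits (input wire clk, input wire reset, output wire q);
    reg [2:0] count;
    assign q = (count == 3'b100);

    always @(posedge clk or posedge reset) begin
        if (reset)
            count <= 3'b000;
        else
            count <= count + 1'b1;
    end
endmodule
```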

Day 6

GLS, Synthesis-Simulation Mismatch and Blocking/Non-blocking Statements

GLS Concepts And Flow Using Iverilog
  • Gate Level Simulation
    • Gate-level simulation is a technique used in digital design and verification to validate the functionality of a digital circuit at the gate-level implementation.
    • It involves simulating the circuit using the actual logic gates and flip-flops that make up the design, as opposed to higher-level abstractions like RTL (Register Transfer Level) descriptions.
    • This type of simulation is typically performed after the logic synthesis process, where a high-level description of the design is transformed into a netlist of gates and flip-flops.
    • We perform GLS to verify the logical correctness of the design after synthesis and, when the gate-level models are delay-annotated, to check that the design meets timing.
image
  • Synthesis-Simulation Mismatch

    • A synthesis-simulation mismatch occurs when the behaviour of the RTL in simulation differs from the behaviour of the synthesized gate-level netlist.
    • Common causes include missing signals in a sensitivity list, incorrect use of blocking versus non-blocking assignments, and non-standard Verilog coding. These arise because the simulator follows the event semantics of the code, while the synthesis tool infers hardware from the function the code describes.
    • This mismatch is a critical concern in digital design because it indicates that the actual hardware implementation might not perform as expected, potentially leading to functional or timing failures in the fabricated chip.
  • Blocking Statements

    • Blocking statements are executed sequentially in the order they appear in the code and have an immediate effect on signal assignments.
    • Example:
     module BlockingExample(input A, input B, input C, output reg Y, output reg Z);
      wire temp;
    
      // Continuous assignment (this is not a blocking assignment)
      assign temp = A & B;
    
      always @(posedge C) begin
          // Blocking assignments: executed in order, Y first and then Z
          Y = temp;
          Z = ~temp;
      end
     endmodule
  • Non-Blocking Statements

    • Non-blocking assignments are used to model concurrent signal updates: all right-hand sides in the block are evaluated first, and the left-hand sides are then updated together at the end of the time step.
    • Example:
     module NonBlockingExample(input clock, input D, input reset, output reg Q);
    
     always @(posedge clock or posedge reset) begin
         if (reset)
             Q <= 0;  // Reset the flip-flop
         else
             Q <= D;  // Non-blocking assignment to update Q with D on clock edge
     end
    endmodule
  • Caveats with Blocking Statements

    • Blocking statements in hardware description languages like Verilog have their uses, but there are certain caveats and considerations to be aware of when working with them. Here are some important caveats associated with using blocking statements:
      • Procedural Execution: Blocking statements are executed sequentially in the order they appear within a procedural block (such as an always block). This can lead to unexpected behavior if the order of execution matters and is not well understood.
      • Lack of Parallelism: Blocking statements do not accurately represent the parallel nature of hardware. In hardware, multiple signals can update concurrently, but blocking statements model sequential behavior. As a result, using blocking statements for modeling complex concurrent logic can lead to incorrect simulations.
      • Race Conditions: When blocking assignments in different procedural blocks triggered by the same event read and write the same signal, a race condition can occur. The outcome depends on the order in which the simulator executes those blocks, which the language does not define, so the result can be inconsistent or unpredictable.
      • Limited Representation of Hardware: Hardware systems are inherently concurrent and parallel, but blocking statements do not capture this aspect effectively. Using blocking assignments to model complex combinational or sequential logic can lead to models that are difficult to understand, maintain, and debug.
      • Combinatorial Loops: Incorrect use of blocking statements can lead to unintentional combinational logic loops, which can result in simulation or synthesis errors.
      • Debugging Challenges: Debugging code with many blocking assignments can be challenging, especially when trying to track down timing-related issues.
      • Not Suitable for Flip-Flops: Blocking assignments are not suitable for modeling flip-flop behavior. Non-blocking assignments (<=) are generally preferred for modeling flip-flop updates to ensure accurate representation of concurrent behavior.
      • Sequential Logic Misrepresentation: Using blocking assignments to model sequential logic might not capture the intended behavior accurately. Sequential elements like registers and flip-flops are better represented using non-blocking assignments.
      • Synthesis Implications: The behavior of blocking assignments might not translate well during synthesis, leading to potential mismatches between simulation and synthesis results.
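A two-stage shift register is a common way to see these caveats in practice. The sketch below uses hypothetical module names. With blocking assignments the second statement reads the value just written by the first, so the intended two flip-flops behave like one; with non-blocking assignments each stage samples the previous value.

```verilog
// Illustrative sketch; module and signal names are assumptions.

// Blocking: q0 is updated first, then q reads the NEW q0.
module shift_blocking (input wire clk, input wire d, output reg q);
    reg q0;
    always @(posedge clk) begin
        q0 = d;
        q  = q0;    // q receives d on the same edge: effectively one stage
    end
endmodule

// Non-blocking: both right-hand sides are sampled before any update.
module shift_nonblocking (input wire clk, input wire d, output reg q);
    reg q0;
    always @(posedge clk) begin
        q0 <= d;
        q  <= q0;   // q receives the OLD q0: a true two-stage shift register
    end
endmodule
```

Swapping the two blocking statements would happen to give two stages, which is exactly why order-dependent code is fragile.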

Labs on GLS and Synthesis-Simulation Mismatch

ternary_operator_mux
  • gvim ternary_operator_mux.v
image

Simulation

  • iverilog ternary_operator_mux.v tb_ternary_operator_mux.v
  • ./a.out
  • gtkwave tb_ternary_operator_mux.vcd

Screenshot from 2023-08-29 08-27-14

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog ternary_operator_mux.v
  • synth -top ternary_operator_mux
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • write_verilog -noattr ternary_operator_mux_net.v
  • show
image image

Gate-Level Simulation (GLS)

  • iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v ternary_operator_mux_net.v tb_ternary_operator_mux.v
  • ./a.out
  • gtkwave tb_ternary_operator_mux.vcd

Screenshot from 2023-08-29 08-41-32
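GLS reuses the RTL testbench unchanged, with the netlist and the standard-cell Verilog models taking the place of the RTL file. As a minimal sketch of that idea, assuming a ternary-operator mux with ports named i0, i1, sel and y (the names and the testbench are hypothetical), a self-checking testbench might look like this:

```verilog
// Illustrative sketch; module, port, and testbench names are assumptions.
module mux_sketch (
    input  wire i0,
    input  wire i1,
    input  wire sel,
    output wire y
);
    assign y = sel ? i1 : i0;
endmodule

// Self-checking testbench: applies all 8 input combinations and
// counts how often the output differs from the expected mux value.
module mux_sketch_tb;
    reg i0, i1, sel;
    wire y;
    integer i, errors;

    mux_sketch dut (.i0(i0), .i1(i1), .sel(sel), .y(y));

    initial begin
        errors = 0;
        for (i = 0; i < 8; i = i + 1) begin
            {sel, i1, i0} = i[2:0];
            #10;
            if (y !== (sel ? i1 : i0))
                errors = errors + 1;
        end
        $display("errors = %0d", errors);
        $finish;
    end
endmodule
```

Because the testbench refers to the design only by module and port names, the same file can be compiled against either the RTL or a synthesized netlist that keeps those names.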

bad_mux
  • gvim bad_mux.v
image

Simulation

  • iverilog bad_mux.v tb_bad_mux.v
  • ./a.out
  • gtkwave tb_bad_mux.vcd

Screenshot from 2023-08-29 08-56-27

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog bad_mux.v
  • synth -top bad_mux
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • write_verilog -noattr bad_mux_net.v
  • show
image image

Gate-Level Simulation (GLS)

  • iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v bad_mux_net.v tb_bad_mux.v
  • ./a.out
  • gtkwave tb_bad_mux.vcd

Screenshot from 2023-08-29 08-59-49
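A mismatch of this kind typically comes from an incomplete sensitivity list. The sketch below is an assumption about the style of code involved, with hypothetical module names. In RTL simulation the first module updates y only when sel changes, so changes on i0 or i1 are missed, whereas synthesis produces an ordinary mux that responds to all three inputs.

```verilog
// Illustrative sketch; module names are assumptions.

// Incomplete sensitivity list: the block re-evaluates only when sel changes.
// RTL simulation holds y while i0 or i1 toggle; the synthesized mux does not.
module bad_mux_sketch (
    input  wire i0,
    input  wire i1,
    input  wire sel,
    output reg  y
);
    always @(sel) begin
        if (sel)
            y = i1;
        else
            y = i0;
    end
endmodule

// Complete sensitivity list: @(*) covers every signal read in the block,
// so RTL simulation and the synthesized netlist agree.
module good_mux_sketch (
    input  wire i0,
    input  wire i1,
    input  wire sel,
    output reg  y
);
    always @(*) begin
        if (sel)
            y = i1;
        else
            y = i0;
    end
endmodule
```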

Labs on Synth-Sim Mismatch for Blocking Statement

blocking_caveat
  • gvim blocking_caveat.v
image

Simulation

  • iverilog blocking_caveat.v tb_blocking_caveat.v
  • ./a.out
  • gtkwave tb_blocking_caveat.vcd

Screenshot from 2023-08-29 09-29-12

Synthesis

  • read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • read_verilog blocking_caveat.v
  • synth -top blocking_caveat
  • abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
  • write_verilog -noattr blocking_caveat_net.v
  • show

Screenshot from 2023-08-29 09-29-57

Screenshot from 2023-08-29 09-30-20

Gate-Level Simulation (GLS)

  • iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v blocking_caveat_net.v tb_blocking_caveat.v
  • ./a.out
  • gtkwave tb_blocking_caveat.vcd

Screenshot from 2023-08-29 09-31-25
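The mismatch seen here can be reproduced with a small sketch, assuming the design computes d = (a | b) & c through an intermediate variable x (the names are assumptions). When the blocking assignments are written in the wrong order, RTL simulation uses the value of x left over from the previous evaluation, as if x were stored, while synthesis builds purely combinational logic.

```verilog
// Illustrative sketch; module and signal names are assumptions.

// Wrong order: d is computed from the x of the PREVIOUS evaluation.
module blocking_order_bad (
    input  wire a,
    input  wire b,
    input  wire c,
    output reg  d
);
    reg x;
    always @(*) begin
        d = x & c;   // reads stale x
        x = a | b;   // x is updated too late for this pass
    end
endmodule

// Correct order: x is computed before it is used, so simulation
// matches the combinational logic (a | b) & c that synthesis infers.
module blocking_order_good (
    input  wire a,
    input  wire b,
    input  wire c,
    output reg  d
);
    reg x;
    always @(*) begin
        x = a | b;
        d = x & c;
    end
endmodule
```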
