A Python-based CPU simulator to understand the basic components and operations of a central processing unit.
This project implements a virtual CPU in Python to simulate the basic operations and components of a computer's central processing unit. The goal was to understand how CPUs work at a fundamental level by creating a working model with registers, memory, an arithmetic logic unit, a control unit, and I/O capabilities.
The CPU currently can execute programs provided as bytearrays. The next planned feature is an assembler to translate human-readable assembly code into the bytecode format.
The virtual CPU implements a simplified architecture with the following components:
- Arithmetic Logic Unit (ALU): Handles arithmetic and logical operations
- Control Unit (CU): Manages the fetch-decode-execute cycle
- Memory Controller: Manages access to RAM
- I/O Controller: Handles input and output operations
- Instruction Unit: Manages program flow and control instructions
- General Purpose Registers: R0, R1
- Output Register: R2
- Link Register: R3 (stores return addresses)
- Memory Byte Register: R4
- Program Counter: R5
- Program Base Address Register: R6 (holds the base address of the current program)
- Z: Zero flag - Set when a result is zero, used for conditional operations
- RAM with configurable size (default: 1024 bytes)
- Support for address-based memory access
The CPU now features a complete interrupt handling system:
- Interrupt Controller: Manages interrupt requests and context switching
- Socket-based Interface: Allows external programs to trigger interrupts while the CPU is running
- Context Preservation: Automatically saves and restores CPU state during interrupts
- Support for Multiple Programs: Load and execute multiple programs without rebooting the CPU
Command | Description |
---|---|
0x00 | Load a program into memory at a specified address |
0x01 | Execute a program at a specified address |
Opcode | Mnemonic | Assembly instruction Format | Description | Flags Affected |
---|---|---|---|---|
Control Operations | ||||
0x00 | NOP | NOP |
No operation, processor continues to next instruction | None |
0x01 | HLT | HLT |
Halt execution, go to idle state | None |
Data Operations | ||||
0x02 | MOV | MOV Rd, Op2 |
Copy Op2 (register or value) into destination register | None |
Branch Operations | ||||
0x03 | BEQ | BEQ [addr] |
Branch to address if Z flag is set (result was zero) | None |
0x04 | BNE | BNE [addr] |
Branch to address if Z flag is not set (result non-zero) | None |
0x05 | B | B [addr] |
Branch unconditionally to address | None |
0x06 | BL | BL [addr] |
Branch to address and store return address in R3 | None |
0x07 | BX | BX |
Branch to address stored in link register (R3) | None |
Arithmetic Operations | ||||
0x08 | ADD | ADD Rd, Rn, Op2 |
Rd = Rn + Op2 | Z |
0x09 | SUB | SUB Rd, Rn, Op2 |
Rd = Rn - Op2 | Z |
0x0A | MUL | MUL Rd, Rn, Op2 |
Rd = Rn * Op2 | Z |
0x0B | DIV | DIV Rd, Rn, Op2 |
Rd = Rn // Op2 (integer division) | Z |
0x0C | MOD | MOD Rd, Rn, Op2 |
Rd = Rn % Op2 (modulo) | Z |
Logical Operations | ||||
0x0D | AND | AND Rd, Rn, Op2 |
Rd = Rn & Op2 (bitwise AND) | Z |
0x0E | ORR | ORR Rd, Rn, Op2 |
Rd = Rn | Op2 (bitwise OR) | Z |
0x0F | XOR | XOR Rd, Rn, Op2 |
Rd = Rn ^ Op2 (bitwise XOR) | Z |
0x10 | NOT | NOT Rd, Op2 |
Rd = ~Op2 (bitwise NOT) | Z |
Shift Operations | ||||
0x11 | LSL | LSL Rd, Rn, Op2 |
Rd = Rn << Op2 (logical shift left) | Z |
0x12 | LSR | LSR Rd, Rn, Op2 |
Rd = Rn >> Op2 (logical shift right) | Z |
Compare Operations | ||||
0x13 | CMP | CMP Rn, Op2 |
Compare Rn with Op2 and set Z flag accordingly | Z |
Memory Operations | ||||
0x14 | LDR | LDR Rd, [addr] |
Load value from memory address into Rd | None |
0x15 | STR | STR Rd, [addr] |
Store value from Rd into memory address | None |
I/O Operations | ||||
0x16 | INP | INP Rd |
Read input from user and store in Rd | None |
0x17 | OUT | OUT Rd |
Output value from Rd to console | None |
0x18 | OUTC | OUTC Rd |
Output character from Rd to console | None |
Interrupt Operations | ||||
0xFF | IRET | IRET |
Return from interrupt, restoring previous CPU state | None |
Where:
Rd
= Destination registerRn
= First operand registerOp2
= Second operand (register or immediate value)[addr]
= Memory address (register or immediate value or label -> label is transformed by assembler to immediate value in Bytecode)Z
= Zero flag (set when result is zero)
Code | Type | Size (bytes) |
---|---|---|
0x00 | register | 1 |
0x01 | value | 2 |
Code | Register | Description |
---|---|---|
0x00 | R0 | General Purpose Register 0 |
0x01 | R1 | General Purpose Register 1 |
0x02 | R2 | Output Register |
0x03 | R3 | Link Register |
0x04 | R4 | Memory Byte Register |
0x05 | R5 | Program Counter |
0x06 | R6 | Program Base Address Register |
Each instruction is represented as a sequence of bytes in the following format:
Opcode [LastOperandType, Oberand1 [Operand2 [Operand3 ...]]]
Where:
- Opcode: 1 byte
- Last Operand Type: 1 byte
- Operands: 0-n bytes (depending on the instruction), using little-endian byte order
Relative Addressing: The CPU supports relative addressing for branch instructions. The address is calculated as an offset from the current Program Base Address Register (R6).
Best way to run and interact with the CPU is with two terminal windows. In one terminal, run the CPU simulator, and in the other, use the provided scripts to load and execute programs. You can for example use tmux to split the terminal window or open two separate terminal windows.
To run the CPU simulator, execute the following command:
python main.py
Programs can now be loaded and executed while the CPU is running using the provided scripts:
# Load a program into memory
./program_loader.sh <program_file.mem> <memory_address>
# Execute a program at a specified address
./program_starter.sh <memory_address>
Note: The memory address for programs should be equal or greater than 0x0A (10 in decimal). The Addresses 0x00 to 0x09 are reserved for the CPU's idle state.
For example:
# Load Hello World program at address 0x0A
./program_loader.sh hello_world.mem 0x0A
# Run the Hello World program
./program_starter.sh 0x0A
# Load Fibonacci sequence generator at address 0xA1
./program_loader.sh fibonacci.mem 0xA1
# Run the Fibonacci sequence generator
./program_starter.sh 0xA1
Note: Make the scripts executable before running them:
chmod +x program_loader.sh program_starter.sh
Here are some example programs written as commented bytearrays that can be loaded and executed by the CPU.
Note: To load these programs dynamically, you need to put the hex values without commas and without the comments, just separated by spaces and/or new lines (like this: 0x02 0x01 0x00 0x48 0x00 0x18 0x00 0x00 ...
), in a file with the .mem
extension (using the extension is optional, but it helps to identify the file type).
Hello World
This program outputs "Hello World" to the console.
hello_world = bytearray([
# Hello World program
0x02, 0x01, 0x00, 0x48, 0x00, # MOV R0, 72 (H)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x65, 0x00, # MOV R0, 101 (e)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x6c, 0x00, # MOV R0, 108 (l)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x6c, 0x00, # MOV R0, 108 (l)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x6f, 0x00, # MOV R0, 111 (o)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x20, 0x00, # MOV R0, 32 (space)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x57, 0x00, # MOV R0, 87 (W)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x6f, 0x00, # MOV R0, 111 (o)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x72, 0x00, # MOV R0, 114 (r)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x6c, 0x00, # MOV R0, 108 (l)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x64, 0x00, # MOV R0, 100 (d)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x0a, 0x00, # MOV R0, 10 (LF)
0x18, 0x00, 0x00, # OUTC R0
0xFF, # IRET
])
Add two numbers
This program adds two numbers and outputs the result.
add_two_numbers = bytearray(
[
# Example: Adding two numbers and outputting the result
0x02, 0x01, 0x00, 0x05, 0x00, # MOV R0, 5 # fmt: skip
0x02, 0x01, 0x01, 0x03, 0x00, # MOV R1, 3 # fmt: skip
0x08, 0x00, 0x00, 0x00, 0x01, # ADD R0, R0, R1 # fmt: skip
0x17, 0x00, 0x00, # OUT R0 # fmt: skip
0xFF, # IRET # fmt: skip
]
)
Adding numbers with user input
This program adds two numbers, gets a new number from the user, adds it to the sum, and outputs the result.
example_program_1 = bytearray(
[
# Example: Adding two numbers, getting a new number from the user, adding it to the sum and outputting the result
0x02, # PC: 0x0000: MOV
0x01, # PC: 0x0001: last_operand_type (value)
0x00, # PC: 0x0002: operand_1 (register 0)
0x01, # PC: 0x0003: second byte of value 1
0x00, # PC: 0x0004: first byte of value 1
0x02, # PC: 0x0005: MOV
0x01, # PC: 0x0006: last_operand_type (value)
0x01, # PC: 0x0007: operand_1 (register 1)
0x02, # PC: 0x0008: second byte of value 2
0x00, # PC: 0x0009: first byte of value 2
0x08, # PC: 0x000a: ADD
0x00, # PC: 0x000b: last_operand_type (register)
0x00, # PC: 0x000c: operand_1 (result register: register 0)
0x00, # PC: 0x000d: operand_2 (summand 1: register 0)
0x01, # PC: 0x000e: operand_3 (summand 2: register 1)
0x16, # PC: 0x000f: INP
0x00, # PC: 0x0010: last_operand_type (register)
0x01, # PC: 0x0011: operand_1 (register 1)
0x08, # PC: 0x0012: ADD
0x00, # PC: 0x0013: last_operand_type (register)
0x00, # PC: 0x0014: operand_1 (result register: register 0)
0x00, # PC: 0x0015: operand_2 (summand 1: register 0)
0x01, # PC: 0x0016: operand_3 (summand 2: register 1)
0x17, # PC: 0x0017: OUT
0x00, # PC: 0x0018: last_operand_type (register)
0x00, # PC: 0x0019: operand_1 (register 0)
0xFF, # PC: 0x001a: IRET
]
)
Check if one number is greater than the other
This program checks if a number is greater than another and outputs the result.
is_greater_than: bytearray = bytearray(
[
# Example: check if a is > than b
# INP R0 (a)
0x16, # PC: 0x0000: INP
0x00, # PC: 0x0001: last_operand_type (register)
0x00, # PC: 0x0002: operand_1 (register 0)
# INP R1 (b)
0x16, # PC: 0x0003: INP
0x00, # PC: 0x0004: last_operand_type (register)
0x01, # PC: 0x0005: operand_1 (register 1)
# SUB R0, R0, R1 (R0 = R0 - R1) (a - b)
0x09, # PC: 0x0006: SUB
0x00, # PC: 0x0007: last_operand_type (register)
0x00, # PC: 0x0008: operand_1 (register 0)
0x00, # PC: 0x0009: operand_2 (register 0)
0x01, # PC: 0x000a: operand_3 (register 1)
# AND R0, R0, 0b1000 0000 0000 0000 (R0 = R0 & 0b1000 0000 0000 0000) (check if the most significant bit of R0 is set to 1)
0x0D, # PC: 0x000b: AND
0x01, # PC: 0x000c: last_operand_type (value)
0x00, # PC: 0x000d: operand_1 (register 0)
0x00, # PC: 0x000e: operand_2 (register 0)
0x00, # PC: 0x000f: second byte of value 0b1000 0000 0000 0000
0x80, # PC: 0x0010: first byte of value 0b1000 0000 0000 0000
# LSR R0, R0, 15 (R0 = R0 >> 15) (shift the most significant bit to the least significant bit)
0x12, # PC: 0x0011: LSR
0x01, # PC: 0x0012: last_operand_type (value)
0x00, # PC: 0x0013: operand_1 (register 0)
0x00, # PC: 0x0014: operand_2 (register 0)
0x0f, # PC: 0x0015: second byte of value 15
0x00, # PC: 0x0016: first byte of value 15
# OUT R0 (if output is 1, a < b, if output is 0, a >= b)
0x17, # PC: 0x0017: OUT
0x00, # PC: 0x0018: last_operand_type (register)
0x00, # PC: 0x0019: operand_1 (register 0)
# IRET
0xFF, # PC: 0x001d: IRET
]
)
Fibonacci sequence generator
This program calculates and outputs the Fibonacci sequence up to a specified maximum value.
fibonacci: bytearray = bytearray(
[
# Example: Output all Fibonacci numbers up to user defined limit
# set_limit:
# INP R0
0x16, # PC: 0x0000: INP
0x00, # PC: 0x0001: last_operand_type (register)
0x00, # PC: 0x0002: operand_1 (register 0)
# STR R0, 0x80 (store the limit in memory)
0x15, # PC: 0x0003: STR
0x01, # PC: 0x0004: last_operand_type (value)
0x00, # PC: 0x0005: operand_1 (register 0)
0x80, # PC: 0x0006: second byte of address
0x00, # PC: 0x0007: first byte of address
# store_first_two_numbers:
# MOV R0, 0
0x02, # PC: 0x0008: MOV
0x01, # PC: 0x0009: last_operand_type (value)
0x00, # PC: 0x000a: operand_1 (register 0)
0x00, # PC: 0x000b: second byte of value 0
0x00, # PC: 0x000c: first byte of value 0
# STR R0, 0x82 (store the first number in memory)
0x15, # PC: 0x000d: STR
0x01, # PC: 0x000e: last_operand_type (value)
0x00, # PC: 0x000f: operand_1 (register 0)
0x82, # PC: 0x0010: second byte of address
0x00, # PC: 0x0011: first byte of address
# BL 0x2c (is_current_number_greater_than_limit)
0x06, # PC: 0x0012: BL
0x01, # PC: 0x0013: last_operand_type (value)
0x2C, # PC: 0x0014: second byte of address (is_current_number_greater_than_limit)
0x00, # PC: 0x0015: first byte of address (is_current_number_greater_than_limit)
# LDR R0, 0x82 (load the first number from memory)
0x14, # PC: 0x0016: LDR
0x01, # PC: 0x0017: last_operand_type (value)
0x00, # PC: 0x0018: operand_1 (register 0)
0x82, # PC: 0x0019: second byte of address
0x00, # PC: 0x001a: first byte of address
# OUT R0
0x17, # PC: 0x001b: OUT
0x00, # PC: 0x001c: last_operand_type (register)
0x00, # PC: 0x001d: operand_1 (register 0)
# MOV R0, 1
0x02, # PC: 0x001e: MOV
0x01, # PC: 0x001f: last_operand_type (value)
0x00, # PC: 0x0020: operand_1 (register 0)
0x01, # PC: 0x0021: second byte of value 1
0x00, # PC: 0x0022: first byte of value 1
# STR R0, 0x84 (store the second number in memory)
0x15, # PC: 0x0023: STR
0x01, # PC: 0x0024: last_operand_type (value)
0x00, # PC: 0x0025: operand_1 (register 0)
0x84, # PC: 0x0026: second byte of address
0x00, # PC: 0x0027: first byte of address
# B 0x47 (print_fibonacci_loop)
0x05, # PC: 0x0028: B
0x01, # PC: 0x0029: last_operand_type (value)
0x47, # PC: 0x002a: second byte of address (print_fibonacci_loop)
0x00, # PC: 0x002b: first byte of address (print_fibonacci_loop)
# is_current_number_greater_than_limit:
# LDR R1, 0x80 (load the limit from memory)
0x14, # PC: 0x002c: LDR
0x01, # PC: 0x002d: last_operand_type (value)
0x01, # PC: 0x002e: operand_1 (register 1)
0x80, # PC: 0x002f: second byte of address
0x00, # PC: 0x0030: first byte of address
# SUB R0, R0, R1 (R0 = R0 - R1) (current_number - limit)
0x09, # PC: 0x0031: SUB
0x00, # PC: 0x0032: last_operand_type (register)
0x00, # PC: 0x0033: operand_1 (register 0)
0x00, # PC: 0x0034: operand_2 (register 0)
0x01, # PC: 0x0035: operand_3 (register 1)
# AND R0, R0, 0b1000 0000 0000 0000 (R0 = R0 & 0b1000 0000 0000 0000) (extract the most significant bit)
0x0D, # PC: 0x0036: AND
0x01, # PC: 0x0037: last_operand_type (value)
0x00, # PC: 0x0038: operand_1 (register 0)
0x00, # PC: 0x0039: operand_2 (register 0)
0x00, # PC: 0x003a: second byte of value 0b1000 0000 0000 0000
0x80, # PC: 0x003b: first byte of value 0b1000 0000 0000 0000
# CMP R0, 0 (compare the result with 0. If the most significant bit is 1 (sign), the current number is less than the limit)
0x13, # PC: 0x003c: CMP
0x01, # PC: 0x003d: last_operand_type (value)
0x00, # PC: 0x003e: operand_1 (register 0)
0x00, # PC: 0x003f: second byte of value 0
0x00, # PC: 0x0040: first byte of value 0
# BNE 0x46 (if the current number is not greater than the limit, jump to jmp_back)
0x04, # PC: 0x0041: BNE
0x01, # PC: 0x0042: last_operand_type (value)
0x46, # PC: 0x0043: second byte of address (jmp_back)
0x00, # PC: 0x0044: first byte of address (jmp_back)
# IRET (if the current number is greater than the limit, halt the program)
0xFF, # PC: 0x0045: IRET
# jmp_back:
# BX (jump back, where came from)
0x07, # PC: 0x0046: BX
# print_fibonacci_loop:
# LDR R0, 0x82 (load the first number from memory)
0x14, # PC: 0x0047: LDR
0x01, # PC: 0x0048: last_operand_type (value)
0x00, # PC: 0x0049: operand_1 (register 0)
0x82, # PC: 0x004a: second byte of address
0x00, # PC: 0x004b: first byte of address
# LDR R1, 0x84 (load the second number from memory)
0x14, # PC: 0x004c: LDR
0x01, # PC: 0x004d: last_operand_type (value)
0x01, # PC: 0x004e: operand_1 (register 1)
0x84, # PC: 0x004f: second byte of address
0x00, # PC: 0x0050: first byte of address
# STR R0, 0x84 (store the first number in memory)
0x15, # PC: 0x0051: STR
0x01, # PC: 0x0052: last_operand_type (value)
0x00, # PC: 0x0053: operand_1 (register 0)
0x84, # PC: 0x0054: second byte of address
0x00, # PC: 0x0055: first byte of address
# ADD R0, R0, R1 (R0 = R0 + R1) (current_number = first_number + second_number)
0x08, # PC: 0x0056: ADD
0x00, # PC: 0x0057: last_operand_type (register)
0x00, # PC: 0x0058: operand_1 (register 0)
0x00, # PC: 0x0059: operand_2 (register 0)
0x01, # PC: 0x005a: operand_3 (register 1)
# STR R0, 0x82 (store the current number in memory)
0x15, # PC: 0x005b: STR
0x01, # PC: 0x005c: last_operand_type (value)
0x00, # PC: 0x005d: operand_1 (register 0)
0x82, # PC: 0x005e: second byte of address
0x00, # PC: 0x005f: first byte of address
# BL 0x2c (is_current_number_greater_than_limit)
0x06, # PC: 0x0060: BL
0x01, # PC: 0x0061: last_operand_type (value)
0x2C, # PC: 0x0062: second byte of address (is_current_number_greater_than_limit)
0x00, # PC: 0x0063: first byte of address (is_current_number_greater_than_limit)
# LDR R0, 0x82 (load the current number from memory)
0x14, # PC: 0x0064: LDR
0x01, # PC: 0x0065: last_operand_type (value)
0x00, # PC: 0x0066: operand_1 (register 0)
0x82, # PC: 0x0067: second byte of address
0x00, # PC: 0x0068: first byte of address
# OUT R0
0x17, # PC: 0x0069: OUT
0x00, # PC: 0x006a: last_operand_type (register)
0x00, # PC: 0x006b: operand_1 (register 0)
# B 0x47 (print_fibonacci_loop)
0x05, # PC: 0x006c: B
0x01, # PC: 0x006d: last_operand_type (value)
0x47, # PC: 0x006e: second byte of address (print_fibonacci_loop)
0x00, # PC: 0x006f: first byte of address (print_fibonacci_loop)
]
)
Cooperative Multitasking: Two programs jumping from one to the other
This program needs to load to address 0x0A
print_a: bytearray = bytearray(
[
0x02, 0x01, 0x00, 0x41, 0x00, # MOV R0, 65 (A)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x0a, 0x00, # MOV R0, 10 (LF)
0x18, 0x00, 0x00, # OUTC R0
0x05, 0x01, 0x96, 0x00, # B 0x96 (0xA0 minus relative address(Start address of programm a))
]
)
This program needs to be loaded to address 0xA0
print_b: bytearray = bytearray(
[
0x02, 0x01, 0x00, 0x42, 0x00, # MOV R0, 66 (B)
0x18, 0x00, 0x00, # OUTC R0
0x02, 0x01, 0x00, 0x0a, 0x00, # MOV R0, 10 (LF)
0x18, 0x00, 0x00, # OUTC R0
0x05, 0x01, 0x00, 0x00, # B 0x00 (0x0A minus relative address (Start address of program a))
]
)
- ✅ Loader: Implemented loader to load programs from files
- ✅ Interrupt Handling: Implemented a complete interrupt system
- Assembler: Create a tool to translate assembly language into bytecode
Contributions to this project are welcome! Whether you're a beginner or experienced developer, your input is valuable. Here are some ways to contribute:
- Bug reports: If you find an issue, please create a GitHub issue
- Feature suggestions: Have ideas for improvements? Let me know!
- Code contributions: Pull requests for new features or bug fixes are welcome
- Documentation: Help improve or extend the documentation
- Educational feedback: As this is a learning project, I appreciate feedback on the architecture and design
This project is available under the MIT License. See the LICENSE file for details.
This project was created as a personal learning exercise for me to understand CPU architecture. It's meant to be educational rather than a production-ready CPU emulator.