Debugger proposal #43

Open
dxrcy opened this issue Sep 24, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request

Comments

@dxrcy
Contributor

dxrcy commented Sep 24, 2024

Lace Debugger Proposal

The debugger can only be run on an assembly file. It cannot run directly on
an object file, as it won't have access to predefined breakpoints or label
names.

In debugger mode, the original object code is stored and not mutated. It is
restored to memory with the reset command, and lines can be displayed with the
source command. Note that the source file cannot be changed while running,
whether by the debugger or the user modifying the file; the program must be run
again.
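
A minimal sketch of how the stored object code could back the reset command. The `Machine` and `Snapshot` types, the eight-register layout, and zeroing registers on reset are assumptions made for illustration, not part of the proposal.

```rust
/// Hypothetical simulator state; the field names are assumptions.
struct Machine {
    memory: Vec<u16>,
    registers: [u16; 8],
    pc: u16,
}

/// Immutable copy of the program as it was loaded.
struct Snapshot {
    initial_memory: Vec<u16>,
    initial_pc: u16,
}

impl Snapshot {
    /// Clone the memory once, before the first instruction runs.
    fn capture(machine: &Machine) -> Self {
        Snapshot {
            initial_memory: machine.memory.clone(),
            initial_pc: machine.pc,
        }
    }

    /// Restore every word, so self-modifying code is undone as well.
    fn restore(&self, machine: &mut Machine) {
        machine.memory.clone_from(&self.initial_memory);
        machine.registers = [0; 8]; // assumption: registers start zeroed
        machine.pc = self.initial_pc;
    }
}
```

Because the snapshot is never mutated, reset can be issued any number of times without reassembling the source file.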

All values will be printed in hex, signed decimal, unsigned decimal, binary, and
ASCII representations (ASCII only if printable; common control characters can
use escape sequences).
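
As a rough illustration of the representations listed above, the sketch below formats one 16-bit word. The function name `format_value`, the field order, and the set of escaped control characters are assumptions; the proposal does not fix an output layout.

```rust
/// Format a 16-bit word as hex, signed, unsigned, binary, and (if possible) ASCII.
/// The layout is hypothetical; only the list of bases comes from the proposal.
fn format_value(value: u16) -> String {
    // Escape a few common control characters; print other printable ASCII as-is.
    let ascii = match value {
        0x00 => Some("'\\0'".to_string()),
        0x09 => Some("'\\t'".to_string()),
        0x0a => Some("'\\n'".to_string()),
        0x0d => Some("'\\r'".to_string()),
        0x20..=0x7e => Some(format!("'{}'", value as u8 as char)),
        _ => None,
    };
    let mut out = format!(
        "x{:04x} {} {} b{:016b}",
        value,
        value as i16, // reinterpret the same bits as two's complement
        value,
        value
    );
    if let Some(ascii) = ascii {
        out.push(' ');
        out.push_str(&ascii);
    }
    out
}
```

For example, `format_value(0x41)` yields `x0041 65 65 b0000000001000001 'A'`, while `format_value(0xFFFF)` yields `xffff -1 65535 b1111111111111111` with no ASCII column.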

All debug output will be printed to stderr.

Command-Line Options

The command-line subcommand lace debug behaves like lace run, except that
it enables the debugger for the program.

The -m or --minimal option will cause the debugger to print minimal output
(no color or fancy formatting). This will be useful for blackbox testing. For
example, set commands will only print a hex value, rather than multiple
representations.

The -c or --command option, followed by a string, will cause the
debugger to use the string argument as its input, instead of stdin. This can
allow a user to separate debugger commands from program input, which will be
useful for blackbox testing.

Command Input

Input is by default taken from stdin. Users can use up/down keys to navigate
command history. Stdin can be piped from a command or file to automate
debugging. See above for the --command command-line option.

Characters are read from input until a newline, semicolon, or EOF is reached.
If a newline or semicolon is reached, the rest of the input is left to be
parsed as the next command. Semicolons are used to separate commands as they
won't interfere with the syntax of the eval command.

If EOF is reached, the debugger treats this as a quit command, which stops
the debugger but continues execution. This will prevent the debugger from
blocking the normal program execution on the EOF of a piped stdin.

Command arguments are separated by spaces, without commas. Each command can have
optional operands, and some commands will warn if too many arguments were given.
Arguments can be integers, labels, register names, or --in the case of eval--
an arbitrary string including spaces.
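
The splitting rules above could be sketched as follows. The names `split_commands` and `split_arguments` are hypothetical, and the sketch works on a whole input string rather than reading character by character, which is a simplification.

```rust
/// Split raw input into commands on newlines and semicolons,
/// dropping surrounding whitespace and empty commands.
fn split_commands(input: &str) -> Vec<&str> {
    input
        .split(|c: char| c == '\n' || c == ';')
        .map(str::trim)
        .filter(|command| !command.is_empty())
        .collect()
}

/// Split one command into its name and space-separated arguments.
/// For `eval`, the rest of the command is kept as a single argument,
/// since an instruction may contain spaces and commas.
fn split_arguments(command: &str) -> Option<(&str, Vec<&str>)> {
    let command = command.trim();
    let (name, rest) = match command.split_once(char::is_whitespace) {
        Some((name, rest)) => (name, rest.trim()),
        None => (command, ""),
    };
    if name.is_empty() {
        return None;
    }
    let arguments = if name == "eval" {
        if rest.is_empty() { Vec::new() } else { vec![rest] }
    } else {
        rest.split_whitespace().collect()
    };
    Some((name, arguments))
}
```

With this shape, the string passed to --command in the later example, "registers; continue", splits into the two commands `registers` and `continue`.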

Debugger Loop

In debugger mode, the program will wait for a user command before executing each
instruction (unless currently continuing).

Before a HALT is executed, the debugger will prompt the user for a command.
The behaviour of some commands in this state is as follows:

  • reset: the memory/registers will be reset, and the program will restart.

  • quit: the debugger will stop, and execution will continue without the
    debugger, resulting in the HALT being executed and the program exiting.

  • exit: The program will immediately exit, as it would on a HALT without the
    debugger.

  • continue/next/etc: The debugger will try to execute the next instruction,
    but since it is a HALT instruction and the debugger is still active, nothing
    will change and the debugger will again prompt the user for a command.

  • All other commands will behave predictably.
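
The HALT rules listed above amount to a small decision table. A possible encoding is sketched below; the `Command` and `HaltAction` enums and their variant names are assumptions introduced for this example only.

```rust
/// Hypothetical parsed command; only the variants relevant here are listed.
#[derive(Debug, PartialEq)]
enum Command {
    Reset,
    Quit,
    Exit,
    Step,
    Next,
    Continue,
    Finish,
    Other, // get, set, registers, break, source, eval, ...
}

/// What the debugger does when a command arrives while PC points at a HALT.
#[derive(Debug, PartialEq)]
enum HaltAction {
    Restart,      // reset: restore memory/registers and start again
    DetachAndRun, // quit: debugger stops, HALT then executes normally
    ExitNow,      // exit: leave the simulator immediately
    PromptAgain,  // stepping commands: no change, prompt again
    RunCommand,   // everything else behaves as usual
}

fn action_at_halt(command: &Command) -> HaltAction {
    match command {
        Command::Reset => HaltAction::Restart,
        Command::Quit => HaltAction::DetachAndRun,
        Command::Exit => HaltAction::ExitNow,
        Command::Step | Command::Next | Command::Continue | Command::Finish => {
            HaltAction::PromptAgain
        }
        Command::Other => HaltAction::RunCommand,
    }
}
```

Since EOF is treated as quit, piped input that ends at a HALT falls into the `DetachAndRun` case and the program exits normally.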

Commands

step COUNT=1 (t)
Step next instruction or INTO subroutine.

next COUNT=1 (n)
Step next instruction or OVER subroutine.

continue (c, cont)
Continue until breakpoint or HALT.

finish (f, fin)
Continue until end of subroutine, breakpoint, or HALT.

quit (q)
Stop debugger and continue execution as normal.

exit (e, ^C)
Exit debugger and simulator.

break list (b l, bl)
List breakpoints.

break add (ADDRESS|LABEL+)=PC (b a, ba)
Add breakpoint at an address/label/PC.

break remove (ADDRESS|LABEL+)=PC (b r, br)
Remove breakpoint at an address/label/PC.

get (REGISTER|ADDRESS+|LABEL+) (g)
Print the value at a register, address, or label.

set (REGISTER|ADDRESS|LABEL+) VALUE (s)
Set the value at a register, address, or label.

registers (r, reg)
Print the value of all registers.

reset (no alias)
Reset all memory and registers.

source COUNT=1 (ADDRESS|LABEL+)=PC (no alias)
Print corresponding line and line number of source code from address/label/PC.

eval OPCODE OPERANDS... (no alias)
Simulate an instruction. Note that labels cannot be created or modified.

Invalid command (h, help, *):
Show available commands.

  • TODO: not all commands have to be implemented.
  • TODO: command names can be changed.
  • TODO: resolve alias conflict between step and set
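
One way the alias table could be resolved is a single match on the first word, sketched below. The function name `resolve_command` is hypothetical, the aliases are taken as listed above (t for step, s for set), and the fused two-letter forms bl, ba, and br are left out for brevity.

```rust
/// Map a command word or alias to its canonical name.
/// Returns `None` for anything unknown, which the caller can treat as help.
fn resolve_command(word: &str) -> Option<&'static str> {
    let name = match word {
        "step" | "t" => "step",
        "next" | "n" => "next",
        "continue" | "c" | "cont" => "continue",
        "finish" | "f" | "fin" => "finish",
        "quit" | "q" => "quit",
        "exit" | "e" => "exit",
        "break" | "b" => "break", // subcommand (list/add/remove) parsed separately
        "get" | "g" => "get",
        "set" | "s" => "set",
        "registers" | "r" | "reg" => "registers",
        "reset" => "reset",
        "source" => "source",
        "eval" => "eval",
        _ => return None,
    };
    Some(name)
}
```

Keeping the aliases in one match lets the compiler warn about a pattern that can never match, which would surface a duplicated alias such as the step/set conflict noted in the TODO.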

Argument Types

COUNT, ADDRESS, VALUE
An integer literal of any supported base. Signed or unsigned.

ADDRESS+
An address range. Eg. x3100..x3102.

LABEL+
A label name with an optional offset. Eg. Foo, Foo+2, Foo-2. Whitespace
cannot appear between the label name and the offset, or it will be treated as a
separate argument.

REGISTER
A register name: r*, pc, cc.

PC
The current program counter value. Used as a default value.

OPCODE OPERANDS...
An assembly instruction, with the same form as found in a source file.
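
A sketch of parsing the integer, ADDRESS+, and LABEL+ argument types. It assumes the x prefix for hex and an optional # prefix for decimal, as in the examples in this proposal; other bases, and the names `Location`, `parse_integer`, and `parse_location`, are assumptions. The existing number parsing code in Lace would likely replace `parse_integer`.

```rust
#[derive(Debug, PartialEq)]
enum Location {
    Address(u16),
    Range(u16, u16),
    Label { name: String, offset: i16 },
}

/// Parse `x3100` (hex) or `#2` / `2` / `#-1` (decimal) into a 16-bit word.
fn parse_integer(text: &str) -> Option<u16> {
    if let Some(hex) = text.strip_prefix('x') {
        return u16::from_str_radix(hex, 16).ok();
    }
    let decimal = text.strip_prefix('#').unwrap_or(text);
    decimal
        .parse::<i32>()
        .ok()
        .filter(|n| (-32768..=65535).contains(n))
        .map(|n| n as u16) // negative values wrap to two's complement
}

/// Parse an address, an address range, or a label with an optional offset.
/// Integers are tried first, so a label such as `xAB` would be read as hex.
fn parse_location(text: &str) -> Option<Location> {
    if let Some((start, end)) = text.split_once("..") {
        return Some(Location::Range(parse_integer(start)?, parse_integer(end)?));
    }
    if let Some(address) = parse_integer(text) {
        return Some(Location::Address(address));
    }
    // The offset starts at the first '+' or '-'; no whitespace is allowed.
    let (name, offset) = match text.find(|c: char| c == '+' || c == '-') {
        Some(index) => (&text[..index], text[index..].parse::<i16>().ok()?),
        None => (text, 0),
    };
    let mut chars = name.chars();
    let first = chars.next()?;
    let valid = (first.is_ascii_alphabetic() || first == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    valid.then(|| Location::Label { name: name.to_string(), offset })
}
```

Resolving a `Label` to an address would then be a lookup in the label list followed by a wrapping add of the offset.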

Breakpoints

Breakpoints can be defined with the .BREAK pseudo-op, or at runtime with a
command. Both predefined and runtime-defined breakpoints are added to a global
list, and the list is checked before executing the next instruction.
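
A small sketch of the breakpoint list and the check made before each instruction. The `Breakpoints` type and its method names are hypothetical; a `Vec<u16>` is used to match the State section, though a set would also work.

```rust
/// Global breakpoint list holding both .BREAK and runtime breakpoints.
#[derive(Default)]
struct Breakpoints {
    addresses: Vec<u16>,
}

impl Breakpoints {
    /// Add a breakpoint; returns false if it was already present.
    fn add(&mut self, address: u16) -> bool {
        if self.addresses.contains(&address) {
            return false;
        }
        self.addresses.push(address);
        true
    }

    /// Remove a breakpoint; returns false if none existed at that address.
    fn remove(&mut self, address: u16) -> bool {
        let before = self.addresses.len();
        self.addresses.retain(|&existing| existing != address);
        self.addresses.len() != before
    }

    /// Checked with the current PC before the next instruction executes.
    fn should_break(&self, pc: u16) -> bool {
        self.addresses.contains(&pc)
    }
}
```

The boolean results from `add` and `remove` give the break add and break remove commands something to report when the request had no effect.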

Labels

Labels are added to a global list, which can be queried by user commands.

State

// Could also be non-static and passed to functions
static mut DEBUGGER: Debugger = None;

// `None` if debugger is not enabled
type Debugger = Option<DebuggerState>;

struct DebuggerState {
    status: DebuggerStatus,

    initial_memory: Vec<u16>,

    breakpoints: Vec<u16>,
    labels: Vec<(String, u16)>,

    command_history: Vec<String>,
    cli_input: Option<String>, // TODO: Include cursor
}

enum DebuggerStatus {
    WaitForCommand,
    ContinueUntilBreakpoint,
    ContinueUntilEndOfSubroutine,
}

Examples

Set some memory addresses, run program until HALT (or breakpoint), and print a
memory address. The EOF is equivalent to quit, and since the debugger is
quit directly before a HALT, the program will reach the HALT and exit.

lace debug hw.asm << EOF
set x3100 #2
set x3101 #4
continue
get x3102
EOF

Use string argument for debugger input.

lace debug hw.asm -c "registers; continue"

Accommodations

The debugger prompt will always be printed on a new line. This can be
guaranteed by remembering if the last printed character was a newline.
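
Remembering the last printed character could be done with a thin wrapper around the output stream, sketched below. The `LineTracker` type is hypothetical, and the sketch assumes all relevant output passes through a single wrapper, which the real simulator would need to arrange for both program and debugger output.

```rust
use std::io::{self, Write};

/// Writer wrapper that remembers whether the last byte written was a newline.
struct LineTracker<W: Write> {
    inner: W,
    at_line_start: bool,
}

impl<W: Write> LineTracker<W> {
    fn new(inner: W) -> Self {
        // Nothing printed yet counts as being at the start of a line.
        LineTracker { inner, at_line_start: true }
    }

    /// Print the prompt, first ending any unfinished line.
    fn print_prompt(&mut self, prompt: &str) -> io::Result<()> {
        if !self.at_line_start {
            self.write_all(b"\n")?;
        }
        self.write_all(prompt.as_bytes())
    }
}

impl<W: Write> Write for LineTracker<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let written = self.inner.write(buf)?;
        if written > 0 {
            // Only the bytes actually written decide the new state.
            self.at_line_start = buf[written - 1] == b'\n';
        }
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```

If the program prints "Hello" without a trailing newline, `print_prompt` emits a newline first; if the last output already ended a line, the prompt is printed directly.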

Features that can be implemented later

  • Preserve command history between runs
  • Add mnemonics for commands, such as halt for run halt

rozukke added the enhancement label Sep 25, 2024

@rozukke
Owner

rozukke commented Sep 26, 2024

Just a few thoughts on the proposal:

  • We have the ability to split out subcommands for the cli interface, so using an explicit debug command seems more ergonomic than having it as a flag.
  • -m and -D seem like pretty good options, though we can use lowercase d for the debugger input flag.
  • memset or set seems like a nicer syntax, but that's a nitpick
  • Would reset also recompile? Ideally that use case would be avoided and the program should be restarted entirely to make state management easier for us
  • Register and memory setting can have the same syntax that would make it a bit nicer, i.e. set r0 x1234 or set label1 x1234 or set x1234 x4321. For memory setting, we might want a way to show a label address to save with an offset (if using .blkw)
  • For StaticSource, don't worry too much, that is only relevant for a recompilation on the same run, and should probably be replaced with Rc to make it less hacky

Overall very well considered and a good featureset to aim for. The main point of consideration is making the most common functions as easy to use as possible, which in this case would be memset and register set.

@dxrcy
Contributor Author

dxrcy commented Sep 26, 2024

We have the ability to split out subcommands for the cli interface, so using an explicit debug command seems more ergonomic than having it as a flag.

Certainly.

...though we can use lowercase d for the debugger input flag.

How about -i/--input (or -c/--commands), since we have already given the debug subcommand explicitly.

Would reset also recompile?

Not necessarily. The object code can be saved (cloned) before being run, and restored on reset. All of memory needs to be restored to its original state, since the user could write code that modifies itself.

Register and memory setting can have the same syntax that would make it a bit nicer.

Certainly. Not sure how I overlooked this. This also allowed the display(d) command to be renamed registers(r).

For memory setting, we might want a way to show a label address to save with an offset.

How about optionally specifying the offset value after the label? E.g. Foo, Foo+2, Foo-2. Whitespace will not be allowed before the +/- symbols, to prevent parsing ambiguity. Perhaps other bases can be supported for the offset; the existing number parsing code can be re-used.

@rozukke
Owner

rozukke commented Sep 27, 2024

Sounds good, reasonable improvements. I'll have a think about how this would integrate with the current codebase to figure out the first steps going forward.

@dxrcy dxrcy mentioned this issue Oct 5, 2024
2 participants