-
Notifications
You must be signed in to change notification settings - Fork 12
[Idea] Brainfuck Assembly #73
Comments
No whitespace (other than newlines) is significant. // Syntax:
// @x = address x
// 'y: = label y
// @x[n] = perform pointer arithmetic (n must be a constant)
// b'c' = byte constant given the ascii value of the given character (sugar for writing the actual number)
'main:
// allocate 4 bytes
mem @a[4]
// could also use b'A', etc.
incr @a[0] 65 // A
incr @a[1] 66 // B
incr @a[2] 67 // C
incr @a[3] 68 // D
// default size is 1 byte
mem @b
// move the value of the memory at address @a to address @b
// leaves @a as zero
// both the source and the target of this operation must be the same size
move @a[2] -> @b
// marks memory as available for use but does not zero it until necessary
// @b can be longer be used
drop @b
// move is more efficient and requires no temporary cell, but copy is useful if you want to keep the memory around
// both the source and the target of this operation must be the same size
mem @c[4]
copy @a -> @c
// using a constant like this is syntactic sugar for incrementing its value into a temporary memory allocation and then dropping the memory
// call cannot be used in a way that causes a cycle (e.g. recursion) because of inlining (see loop for a way to do loops) - maximum recursion depth is used to avoid infinite compilation
call foo b'x' @c
// this ret is required
ret
// Function blocks can have zero or more arguments
// If no size is given, the default is 1 byte
// All allocated memory inside a function block is dropped at the end
fn foo x y[4] {
write x
write y[3]
// returns to the instruction after where this was called
// without this, there's a chance that we would keep going to the next instruction
// the only way that couldn't happen is if the next thing was a function block (in which case you would get a compilation error)
ret
} Other instructions: // Tests if @a is non zero and jumps to the second argument
if @a ('foo 33 @a) 'bar
// Continues to loop while the value at the given memory address is non-zero
loop @a 'body
// Reads n bytes from the input into @a where n is the declared size of @a
read @a
// Decrement is the mirror operation to incr
decr @a b'0'
// Set all the bytes at the given address to zero
zero @a Drop implementation: Todo: determine if manual memory management is actually useful or if we should just have blocks. Add macros for doing things like generating the code to print any string |
Old notes from Jul 16: The operations that we have defined as our IR before code generation can be written out as their own little low-level language. // The only types are memory blocks denoted [u8; non-negative size] and u8 (a single-byte)
// u8 means "byte" (8-bits) and the size is the size of the memory block in bytes
// Every variable must have a type
let q: [u8; 4];
// can be uninitialized (memory defaults to zero)
let a: [u8; 3];
// can be initialized to a literal
let b: [u8; 3] = [111, 244, 0];
// can be initialized to a single, repeating value
// size can be inferred
let c: [u8; _] = [23; 100];
// variables must be explicitly copied
//WRONG: a = b;
//RIGHT:
copy b -> a;
// move instead of copying (saves instructions)
let d: [u8; 3];
move b -> d;
// We are currently in the top most block known as the "root" block
// blocks can be arbitrarily nested
{
// variables are only declared in their block and blocks below
let q: [u8; 1]; // only available until the end of this block
{
let r: [u8; 3]; // this memory will be made available again at the end of this block
copy a -> r;
}
let s: [u8; 100];
}
// Set some memory to zero
zero a;
// read into a memory block
read a;
// write memory to output
write a;
// indexing is supported
write a[1..];
// incrementing and decrementing
a[3] += 33;
a[3] -= 37; |
This brain assembly language can be used as an IR for this compiler. It has a simple bytes-only memory representation and is in a sense much "lower level" than brain.
Notes
Recursion (more than tail call) can be implemented if we implement a stack in bf. Each call operator would go to a new level of the stack (dynamically) instead of trying to determine that stuff statically.
https://cs.stackexchange.com/questions/991/are-there-minimum-criteria-for-a-programming-language-being-turing-complete
OLD NOTES
Add Span to label so the hash is unique
No whitespace (other than newlines) is significant.
Other instructions:
Drop implementation:
Cells are allocated as each mem call is reached. When drop is called, the cells are marked "dropped", but not zeroed. Cells are not dropped when ret is called. Memory management must be done manually. If the compiler can prove the cells have been zeroed (e.g. if the entire contents were moved), it marks them as "zeroed". When looking for the next cells to allocate, the allocator looks for the nearest cell to the current position that is marked "dropped". If the cells are not marked "zeroed", instructions are inserted to zero those cells before doing anything to that memory. If no dropped cells are available or if it would be closer to just use some memory at the beginning or end of the tape, that memory is preferred.
Todo: determine if manual memory management is actually useful or if we should just have blocks.
The text was updated successfully, but these errors were encountered: