This repository has been archived by the owner on Sep 13, 2024. It is now read-only.

LIBVSL

This documents the API for LIBVSL, and how to use it to make a (very simple) LISP.

Implementing LIBVSL

To create a LISP using LIBVSL, you’ll need at least one C source file that includes the libvsl.h header and is linked against the libvsl.a library.

Include the libvsl.h header exactly once per implementation, and only in its front-facing part. If you need specific functionality, include some of the other header files in this repository (take a look around!).

Call libvsl_init() to initialize the framework. It sets up the memory-related objects, so any and all further operations on LISP objects must happen after this function has executed.

LIBVSL is a purely symbolic LISP: the only things a user can define explicitly are pure symbols and expressions, so there are no strings, integers, floats or any other literals. LIBVSL itself does not provide any primitive symbols, only the framework to process them. If you want symbols with a defined value, you’ll have to define them after you initialize the framework.

Symbol tables

LIBVSL comes with an interface for the programmer (and user) to define top-level symbols.

To define symbols for the top-level, the programmer declares an array with the CLISP_TAB(table_name) macro. Each element of the array is defined with the CLISP_DEF(symbol_name, symbol_obj) macro.

By the nature of LIBVSL, there are no ‘primitive types’ apart from symbols and expressions. Programmatically this means there are no int, float or char* fields: everything is stored in a separate memory pool, and the symbol only holds a pointer to it, which places the burden of type-casting the symbol’s data on the programmer.

If you want a symbol to hold a type not defined in union lisp_obj_m (see lisp.h), it’s recommended that you use the struct lisp_foreign interface (also in lisp.h).

LIBVSL comes with no top-level table by default, so you’ll need to define your own in the same module where you define your symbols. For the top-level, declare a variable of type struct lisp_symtab* and initialize it to NULL. Then, to load the symbols defined in your table, call clisp_init(<your-top-level>, <your-table>, <your-table-size>) and assign the return value (after asserting it’s not NULL) to <your-top-level>. Calling clisp_init with NULL as the first argument creates a new table holding the values defined in <your-table>. If the first argument is not NULL, the values are added alongside the symbols already in the table, overwriting them where names collide.

LIBVSL functions

To define LIBVSL top-level functions, it’s recommended you use the CLISP_DEFUN(function_name) macro in place of the signature of a normal C function definition, like:

CLISP_DEFUN(foo) {
  ...
}

All of LIBVSL’s functions are of the type lisp_fun (see lisp.h).

As for its arguments:

The functions follow an approach similar to C’s main: there is an argument pointer argp (where the function itself is the first argument), an argument counter argv (where the function itself counts as the first element), and an environment pointer envp, which corresponds to the function’s outer scope.

  1. argp: argp is of the type struct lisp_sexp*, where the SEXP is the ‘proper’ function expression, that is, the expression obtained once every argument has been evaluated. For instance:
    (foo (+ 1 2) (* 3 4)) ->
    
    (foo 3 12)
    ; ^ argp is the whole expression
        
  2. argv: argv is not strictly needed. It only saves you from iterating the tree every time you need the number of arguments. If the top-level is initialized, the stack functions always count the current elements before deferring to the proper function.
  3. envp: envp is of the type struct lisp_symtab*. It contains the current scope for the function

To iterate through the arguments of your function, use the FOR_ARG(arg, args) macro.

A typical use might look like:

CLISP_DEFUN(foo) {
  struct lisp_ret   ret_t = {0};
  register int        ret = 0;
  struct lisp_arg_t  args = {0};

  FOR_ARG(arg, args) {
    // process `arg'...
  }

  done_for((ret_t.slave = (ret? __LISP_ERR: __LISP_OK), ret_t));
}

As for the return type:

The functions return the type struct lisp_ret (see lisp.h), which has a master-slave layout:

  1. master: struct lisp_obj: The ‘main’ return type of the function
  2. slave: enum lisp_ret_t: Generic return status, mainly used for interop with other internals of LIBVSL. It should be __LISP_OK, except for LIBVSL-related errors, which use the __LISP_ERR status. For example:
    CLISP_DEFUN(foo) {
      struct lisp_ret   ret_t = {0};
      register int        ret = 0;
    
      long* mem = mm_alloc(sizeof(long));
    
      // If `mm_alloc' wasn't able to allocate memory, we return `__LISP_ERR'.
      // If your function has some other form of 'error', you should still
      // return `__LISP_OK' and return your error in some wrapper in `master':
      // reserve `__LISP_ERR' for internals like these
      assert(mem, OR_ERR());
      mm_free(mem);
    
      done_for((ret_t.slave = (ret? __LISP_ERR: __LISP_OK), ret_t));
    }
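
The split between the two fields can be illustrated without LIBVSL at all. The sketch below is a standalone model, not LIBVSL code: every toy_ name, the TOY_USER_ERROR kind and the injectable allocator are assumptions made up for illustration. It only mirrors the convention described above, where an internal failure sets the status field and an error of the function itself travels inside the main return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the types in lisp.h (assumed names). */
enum toy_status { TOY_OK = 0, TOY_ERR };                 /* role of `slave'  */
enum toy_kind   { TOY_NIL = 0, TOY_INT, TOY_USER_ERROR };

struct toy_obj { enum toy_kind kind; long num; };        /* role of `master' */
struct toy_ret { struct toy_obj master; enum toy_status slave; };

/* Allocator signature, injectable so that a failure can be simulated. */
typedef void* (*toy_alloc_fn)(size_t);

/* An allocator that always fails, standing in for an exhausted pool. */
void* toy_failing_alloc(size_t size) {
  (void) size;
  return NULL;
}

/* Divides a by b using one scratch allocation.
 * - allocation failure is an internal error: slave becomes TOY_ERR
 * - division by zero is the function's own error: slave stays TOY_OK
 *   and the error is reported through master */
struct toy_ret toy_div(long a, long b, toy_alloc_fn alloc) {
  struct toy_ret ret = {0};
  assert(alloc != NULL);

  long* scratch = alloc(sizeof(*scratch));
  if (!scratch) {
    ret.slave = TOY_ERR;
    return ret;
  }

  if (b == 0) {
    ret.master.kind = TOY_USER_ERROR;
  } else {
    *scratch        = a / b;
    ret.master.kind = TOY_INT;
    ret.master.num  = *scratch;
  }

  free(scratch);
  return ret;
}
```

A caller would check slave first and only then look at master, so toy_div(1, 0, malloc) is a successful call that happens to carry an error object.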
        

LIBVSL objects

To define LIBVSL top-level objects for the CLISP_DEF macro, it’s recommended you use the CLISP_OBJ(object_type, object_assign) macro. object_assign should have the syntax of a C field assignment (<field-name> = <field-value>), where <field-name> is any of the fields defined in union lisp_obj_m _, in lisp.h.

This macro is really only useful for objects whose data is of one of the primitive types (symbols, SEXPs, C functions, …). For more generic data, use the CLISP_OBJ_GEN(object_type, object_memory) macro, which wraps the data in the struct lisp_foreign structure.

The type of primitive objects is defined in the enum lisp_obj_t, in lisp.h.

A typical use might look like:

struct lisp_sexp* some_sexp = NULL;
int some_int = 2;

enum my_types {__LISP_MY_NIL = 0, __LISP_MY_INT};

CLISP_TAB(my_tab) {
  CLISP_DEF("my_some_sexp", CLISP_OBJ(__LISP_OBJ_SEXP, exp = some_sexp)),
  CLISP_DEF("my_some_int", CLISP_OBJ_GEN(__LISP_MY_INT, &some_int)),
};

// ...

For functions, we have two primitive types:

  1. __LISP_OBJ_CFUN
  2. __LISP_OBJ_CFUN_LIT

Where the first one is a generic function, and the second one is more like a macro: it quotes all its elements.

Top-level

To start programming in LIBVSL, you’ll need to init a top-level. Depending on the kind of input, you’ll have different functions to call:

  1. user defined expressions (stdin): lisp_toplevel_lex(<top-level>)
  2. programmer defined expressions (string): lisp_toplevel_str(<top-level>, <string>)
  3. user/programmer defined expressions (SEXP): lisp_toplevel_exp(<top-level>, <sexp>)

Where <top-level> is your top-level symbol table. If you have C-defined symbols, init them through clisp_init (see previous section).

REPL-like behaviour can be achieved via the first function.

Implementation example

Here’s an implementation that defines a LISP with a single function called hello-world that just prints “hello world”. If you’ve configured and compiled LIBVSL correctly, you should be able to compile the code below.

Source code (myvsl.c):

#include "libvsl.h"

#include <stdio.h>

CLISP_DEFUN(myvsl_hello_world) {
    printf("hello world\n");
    return (struct lisp_ret) {0};
}

static CLISP_TAB(myvsl_tab) {
    // (hello-world)
    CLISP_DEF("hello-world",
              CLISP_OBJ(__LISP_OBJ_CFUN, fun = myvsl_hello_world)),
};

int main(void) {
  register int ret = 0;

  MAYBE_INIT(libvsl_init());

  struct lisp_symtab* tab = NULL;

  tab = clisp_init(tab, myvsl_tab, SIZEOF(myvsl_tab));
  assert(tab, OR_ERR());

  ret = lisp_toplevel_lex(tab);

  done_for(ret);
}

Building:

$ make libvsl.a
$ cc -I. -L. myvsl.c -lvsl -o myvsl
$ ./myvsl
(hello-world)
hello world
(something-thats-not-defined)
[ !! ] libvsl: symtab: symbol was not found
$

Programming in LIBVSL

Reading/Writing memory

When actually programming in LIBVSL, you define something with the struct lisp_sym* lisp_symtab_set(string_ip sym_str, struct lisp_obj sym_obj, struct lisp_symtab* sym_tab) function. sym_str is a string iterator, which can be obtained from the lexer with lex.symbuf or from a C string with the string_ip to_string_ip(char* string) function (this doesn’t allocate new memory). lisp_symtab_set returns a pointer to the newly allocated symbol, or NULL if an error occurred.

This function overwrites any existing symbol of the same name. If you want set-or-get behaviour instead, use lisp_symtab_sets: it has the same signature as lisp_symtab_set, but it doesn’t overwrite the symbol if it already exists.
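
The difference between the two setters is easiest to see on a toy table. The following sketch is standalone and does not use LIBVSL: the toy_ types and functions, the fixed capacity and the int payload are all assumptions made for illustration. In particular, it assumes the set-or-get variant returns the symbol that already exists, which the text above suggests but does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAP 8

/* Hypothetical stand-ins for struct lisp_sym and struct lisp_symtab. */
struct toy_sym    { const char* name; int value; };
struct toy_symtab { struct toy_sym slot[TOY_CAP]; size_t len; };

/* Returns the symbol called `name', or NULL if the table lacks it. */
struct toy_sym* toy_get(struct toy_symtab* tab, const char* name) {
  assert(tab && name);
  for (size_t i = 0; i < tab->len; i++)
    if (strcmp(tab->slot[i].name, name) == 0)
      return &tab->slot[i];
  return NULL;
}

/* Modelled on lisp_symtab_set: always stores `value', replacing an
 * older binding of the same name. Returns NULL when the table is full. */
struct toy_sym* toy_set(struct toy_symtab* tab, const char* name, int value) {
  struct toy_sym* sym = toy_get(tab, name);
  if (!sym) {
    if (tab->len == TOY_CAP)
      return NULL;
    sym       = &tab->slot[tab->len++];
    sym->name = name;
  }
  sym->value = value;
  return sym;
}

/* Modelled on lisp_symtab_sets: an existing binding is returned
 * untouched (assumed behaviour), a missing one is created. */
struct toy_sym* toy_sets(struct toy_symtab* tab, const char* name, int value) {
  struct toy_sym* sym = toy_get(tab, name);
  return sym ? sym : toy_set(tab, name, value);
}
```

With this model, setting x twice leaves the second value in place, while a later toy_sets on x hands back the stored symbol and ignores the value it was given.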

To get a symbol from the symbol table given its string representation, use struct lisp_sym* lisp_symtab_get(string_ip sym_str, struct lisp_symtab* sym_tab).

Call stack

The call stack of LIBVSL depends on the top-level called. For the lexer, struct lisp_ret lisp_stack_frame_lex(struct lisp_symtab* envp) is called. For an SEXP, struct lisp_ret lisp_stack_frame_sexp(struct lisp_symtab* envp) is called.

User-defined functions (lambdas) call the lisp_eval function, which has the type lisp_fun (see lisp.h). The stack calls lisp_eval with argp[0] being a pointer to an SEXP that should contain other expressions to be evaluated.

C functions are called directly in the stack.

Memory management

LIBVSL limitations

By design, any frontend implementation of LIBVSL will have some limitations.

LIBVSL LISP concept differences

No CONS

There are some internal differences between LIBVSL and other LISPs. The most glaring is that we have no native cons data structure. That is, a list expression is not stored like:

(foo bar baz)
this: [foo|*]->[bar|*]->[baz|=]

(foo (bar baz))
or this: [foo|*]->[[bar|*]->[baz|=]|=]

(‘*’ points to an SEXP, ‘=’ is empty)

where SEXPs are treated like CONS cells. Instead, we internally store SEXPs as full expressions in binary trees, and when an expression has more than two elements, the rest of it is parented by so-called LEXPs. The struct lisp_sexp type of LIBVSL stores an SEXP as a left and a right node. That means we can tell where the expression ends when the right node is a symbol or doesn’t exist, instead of resorting to ‘empty’ ends.

To use the example above, they would look like:

(foo bar baz)
this: [foo|>]->[bar|baz]

(foo (bar baz))
or this: [foo|[bar|baz]]

(‘>’ points to an LEXP)

Internally, we cannot evaluate expressions parented by LEXPs, as they serve as a sort of “continuation” of their own parent. So instead of car and cdr, we use head and tail, which are different because:

(cdr '(foo (bar baz))) -> ((bar baz))
(tail '(foo (bar baz))) -> (bar baz)
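
To make the left/right layout concrete, here is a standalone model of it. This is a sketch, not the real struct lisp_sexp: the toy_ names, the tag enum and the slot layout are assumptions made for illustration. A right child tagged as an LEXP continues its parent's list, while one tagged as an SEXP opens a nested list, which is why tail can hand back the right child as it is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct lisp_sexp (assumed layout). */
enum toy_tag { TOY_NONE = 0, TOY_SYM, TOY_SEXP, TOY_LEXP };

struct toy_node;
struct toy_slot {
  enum toy_tag           tag;
  const char*            sym;   /* set when tag is TOY_SYM              */
  const struct toy_node* node;  /* set when tag is TOY_SEXP or TOY_LEXP */
};
struct toy_node { struct toy_slot left, right; };

/* Appends s to the string in out, never writing past cap bytes. */
void toy_cat(char* out, size_t cap, const char* s) {
  size_t used = strlen(out);
  if (used + 1 < cap)
    strncat(out, s, cap - used - 1);
}

void toy_render_node(const struct toy_node* n, char* out, size_t cap);

/* An SEXP child gets its own parentheses; an LEXP child does not. */
void toy_render_slot(const struct toy_slot* s, char* out, size_t cap) {
  switch (s->tag) {
  case TOY_NONE: break;
  case TOY_SYM:  toy_cat(out, cap, s->sym); break;
  case TOY_LEXP: toy_render_node(s->node, out, cap); break;
  case TOY_SEXP:
    toy_cat(out, cap, "(");
    toy_render_node(s->node, out, cap);
    toy_cat(out, cap, ")");
    break;
  }
}

void toy_render_node(const struct toy_node* n, char* out, size_t cap) {
  toy_render_slot(&n->left, out, cap);
  if (n->right.tag != TOY_NONE) {
    toy_cat(out, cap, " ");
    toy_render_slot(&n->right, out, cap);
  }
}

/* Renders a whole expression into out, starting from an empty string. */
void toy_render(const struct toy_node* n, char* out, size_t cap) {
  assert(n && out && cap > 0);
  out[0] = '\0';
  toy_cat(out, cap, "(");
  toy_render_node(n, out, cap);
  toy_cat(out, cap, ")");
}

/* tail is just the right child, with no extra wrapping list. */
const struct toy_slot* toy_tail(const struct toy_node* n) {
  return &n->right;
}

/* Counts list elements by following the chain of LEXPs. */
size_t toy_count(const struct toy_node* n) {
  size_t count = 0;
  for (;;) {
    if (n->left.tag != TOY_NONE) count++;
    if (n->right.tag == TOY_LEXP) { n = n->right.node; continue; }
    if (n->right.tag != TOY_NONE) count++;
    return count;
  }
}
```

In this model the same [bar|baz] node serves both examples: hung off foo as an LEXP it renders as (foo bar baz), and hung off foo as an SEXP it renders as (foo (bar baz)), whose tail renders as (bar baz).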