diff --git a/docs/design/tracker.md b/docs/design/tracker.md new file mode 100644 index 0000000..81a6065 --- /dev/null +++ b/docs/design/tracker.md @@ -0,0 +1,93 @@ +# The symbol tracker +As with most compilers & programming languages, the Chirp compiler needs to track various names declared by the user program. +It needs to resolve identifiers to variables, parameters, namespaces, functions, types (TODO), etc. +It also needs to track various attributes and properties of symbols, like full access path (if available), whether it is global, etc. +All of that is provided and managed by the symbol tracker. + +## Symbols +Symbols keep a record of various program entities. The complete list is as follows: +- Top scope - The root scope of the whole program +- Global variables - Variables that live in the top scope or in namespace scope +- Global functions +- Namespaces +- Nested scopes - Usually unnamed entities that live within some other entity's scope, like compound block scopes. + +Every symbol has a parent symbol (except the root symbol, which has none), an optional local name (name that is accessible in the symbol's scope), and an optional global name (which is the path used to access the symbol from global scope). +A symbol can also be tied to a particular AST node which introduced it, so further information can be obtained (like type, kind of symbol, etc.). + +## Scope +A scope defines a subset of visible symbols that are referenceable at a particular point in a program. No two symbols of the same name (unless they're describing the same symbol) can exist within the same scope's local set. + +A scope's local set specifies symbols and hides (temporary exclusions of symbols from the scope), which are unique to that scope. A name defined in this set is not visible outside of it, but it's visible in subscopes defined within this scope (unless a subscope hides it). + +Name shadowing occurs when a scope defines a symbol (or lack thereof, i.e.
a hide) with the same name as a name in an enclosing scope. + +Scopes are tied to symbols (which can be unnamed and local). This allows for lookup within a scope when a name is encountered. + +## Name lookup +There are two kinds of lookup: unqualified and qualified. An unqualified lookup usually occurs when an identifier is to be resolved. +Qualified lookup occurs when a name is to be resolved in the scope of another symbol. + +When a qualified identifier is looked up, the first part of the identifier is resolved by means of unqualified lookup, and then the following parts are resolved in the scope of the previously resolved symbol by qualified lookup. + +### Unqualified lookup +When an unqualified lookup occurs, the following steps are taken. All scopes, starting at the current nested scope, are examined for the searched name, in order from most to least nested. When a symbol is found (or a lack of one is explicitly defined in a scope), the lookup stops, and the result (whether a symbol was found or not) is returned. When the search reaches the end of the list of scopes, the lookup fails with no symbol returned. + +### Qualified lookup +Qualified lookup considers only the scope within which it occurs. When no name is defined, lookup fails with no symbol. When a name is found, the result (a symbol or not) becomes the result of the lookup. + +# Tracker API +The tracker tracks all symbols used in the file currently processed. The first symbol corresponds to the root of the program syntax tree. + +Creating and binding symbols is done with these instance methods: + +```c++ +symbol* decl_sym(); +symbol* decl_sym(identifier const& name, decl& target); +``` +This method creates a new symbol, and optionally assigns a name (makes it named: see `has_name`) and a target node to it. + +```c++ +bool bind_sym(symbol* sym); +``` +This method binds a symbol to the current scope. It returns true on success. On failure, it returns false & reports the proper diagnostics where appropriate.
+ +Looking up symbols is done through the following instance methods: + +```c++ +symbol* find_sym_cur(identifier const& name); +``` +This is a low-level function that only searches the current scope. If the symbol is not found within the current scope, it returns null. + +```c++ +symbol* lookup_sym(identifier const& name); +symbol* lookup_sym_qual(qual_identifier const& name); +``` +These two methods perform unqualified and qualified lookup, respectively, on identifiers. The first one doesn't report diagnostics on failure, but the second one does. + +```c++ +symbol* lookup_decl_sym(decl const& decl_scope, identifier const& name); +``` +This low-level function performs a qualified lookup inside the given symbol to find a name. It considers only the scope of the provided symbol. It reports diagnostics on failure. + +These functions deal with scopes: + +```c++ +void push_scope(symbol* sym); +``` +This method creates and enters a new scope, described by the provided symbol. + +```c++ +void pop_scope(); +``` +This method exits the current nested scope and goes back to the one directly enclosing it. Exiting the main program (global) scope is undefined. + +## Symbol attributes +Each symbol has the following set of attributes: +| Name | Type | Default value | Description | +| --- | --- | --- | :-- | +| `has_name` | `bool` | `false` | Has a `name` | +| `is_global` | `bool` | `true` | Lives in global scope (can be potentially exported) | +| `has_storage` | `bool` | `false` | Defines a concrete entity that exists in produced object code (variables, functions) | +| `is_entry` | `bool` | `false` | Is an entry declaration | +| `is_scope` | `bool` | `false` | Defines a scope (see: [Scope](#scope)) | diff --git a/lib/io.chp b/lib/io.chp new file mode 100644 index 0000000..9acdd02 --- /dev/null +++ b/lib/io.chp @@ -0,0 +1,22 @@ +# Include this or not?
+namespace io +{ + extern "fputs" + func int __libc_fputs(ptr const char: string, ptr: fileio); + extern "fputc" + func int __libc_fputc(int: ch, ptr: fileio); + extern "stdout" + ptr: __libc_stdout; + + func none write(ptr const char: string) + { + __libc_fputs(string, __libc_stdout); + } + + func none print(ptr const char: string) + { + __libc_fputs(string, __libc_stdout); + # Newline + __libc_fputc(10, __libc_stdout); + } +} diff --git a/lib/mem.chp b/lib/mem.chp new file mode 100644 index 0000000..8edf962 --- /dev/null +++ b/lib/mem.chp @@ -0,0 +1,5 @@ +namespace mem +{ + extern "malloc" + func ptr alloc(unsigned long: size); +} diff --git a/samples/features.chp b/samples/features.chp index 611ad01..e2334e3 100644 --- a/samples/features.chp +++ b/samples/features.chp @@ -9,7 +9,7 @@ import "math.trig"; namespace mem { extern "malloc" - func ptr alloc(long: size); + func ptr alloc(unsigned long: size); } namespace io @@ -28,8 +28,8 @@ namespace oslib # Forward function declarations func ptr const char type_name(int: id); -func ptr const char find_cstr_end(ptr const char); # Unnamed parameters (TODO) -func none memcpy(ptr: dest, ptr const: src, int: count); +func ptr const char find_cstr_end(ptr const char); # Unnamed parameters +func ptr memcpy(ptr: dest, ptr const: src, unsigned long: count); # Functions func ptr const char type_name(int: id) @@ -64,18 +64,21 @@ func ptr const char find_cstr_end(ptr const char: s) ret s; } -func none memcpy(ptr: dest, ptr const: src, int: count) +func ptr memcpy(ptr: dest, ptr const: src, unsigned long: count) { # Pointer convertions (TODO: type-check) - ptr byte: _dest = dest as ptr char; - ptr const byte: _src = src as ptr const char; - # Convert to bool (TODO) + ptr byte: _dest = dest as ptr byte; + ptr const byte: _src = src as ptr const byte; + # Convert to bool while count { # lvalue assignment (TODO) deref _dest = deref _src; count = count - 1; + _dest = _dest + 1; + _src = _src + 1; } + ret dest; } # Program entry point 
@@ -84,12 +87,15 @@ entry # Function calls ptr const char: my_string = "hello"; ptr const char: my_string_end = find_cstr_end(my_string); - const int: my_string_size = my_string_end - my_string + 2; + const int: my_string_size = my_string_end - my_string + 1; ptr char: my_heap_str = mem.alloc(my_string_size); memcpy(my_heap_str, my_string, my_string_size); - const char: a = 0; # Empty string + ptr char: a = alloca(char) 3; + deref a = 'h'; + deref (a + 1) = 'i'; + deref (a + 2) = 0; io.print(my_heap_str); io.print(" "); io.print(type_name(1)); - io.print(ref a); + io.print(a); } diff --git a/samples/fib.chp b/samples/fib.chp index 6755f6a..6daa4d3 100644 --- a/samples/fib.chp +++ b/samples/fib.chp @@ -1,7 +1,7 @@ -func int fib(int: n) +func unsigned long fib(unsigned int: n) { - int: a = 0; - int: b = 1; + unsigned long: a = 0; + unsigned long: b = 1; while n != 0 { const int: tmp = b; @@ -15,10 +15,65 @@ func int fib(int: n) import "io" namespace io { - func none print(ptr const char: msg, int: param); + func none write(ptr const char: str); + func none print(ptr const char: str); + + # Returns number of chars needed to store the string, including zero-terminator + func unsigned long ulong_to_string(unsigned long: val, ptr char: str, unsigned long: buflen) + { + unsigned long: size = 1; + unsigned long: _val = val; + unsigned long: ct = 1; + unsigned long: idx = 0; + while _val / ct >= 10 + { + ct = ct * 10; + size = size + 1; + } + while ct != 1 + { + if idx == buflen + { + ct = 1; + } + else + { + ct = ct / 10; + deref(str + idx) = _val / ct + '0'; + _val = _val - _val / ct * ct; + idx = idx + 1; + } + } + if idx != buflen + { + deref(str + idx) = 0; + } + ret size + 1; + } } entry { - io.print("Result of fib(5): ", fib(5)); + ptr char: small_buf = alloca(char) 5; + unsigned long: buf_size = 5; + unsigned long: value = fib(5); + unsigned long: size = io.ulong_to_string(value, small_buf, buf_size); + if size > buf_size + { + buf_size = size; + small_buf = 
alloca(char) buf_size; + io.ulong_to_string(value, small_buf, buf_size); + } + io.write("Result of fib(5): "); + io.print(small_buf); + value = fib(50); + size = io.ulong_to_string(value, small_buf, buf_size); + if size > buf_size + { + buf_size = size; + small_buf = alloca(char) buf_size; + io.ulong_to_string(value, small_buf, buf_size); + } + io.write("Result of fib(50): "); + io.print(small_buf); } diff --git a/src/ast/ast.hpp b/src/ast/ast.hpp index febbc92..0c1ef42 100644 --- a/src/ast/ast.hpp +++ b/src/ast/ast.hpp @@ -6,8 +6,8 @@ All possible AST nodes are defined here */ #include "types.hpp" #include "../shared/location_provider.hpp" +#include "../shared/system.hpp" #include -#include #include // === SHARED === @@ -18,20 +18,25 @@ class ast_node }; // Forward declarations +class tracker_symbol; + class ast_root; class identifier; +class raw_qual_identifier; class qual_identifier; + class expr; class binop; class unop; class arguments; class func_call; class id_ref_expr; -class loperand; class string_literal; class integral_literal; class nullptr_literal; class cast_expr; +class alloca_expr; + class decl; class var_decl; class entry_decl; @@ -41,6 +46,7 @@ class namespace_decl; class parameters; class func_decl; class func_def; + class stmt; class decl_stmt; class assign_stmt; @@ -126,12 +132,17 @@ class identifier : public ast_node } }; -class qual_identifier : public ast_node +class raw_qual_identifier { public: // The parts vector for a.b.c.foo() would be: // {"a","b","c","foo"}.. Further in vector => More nested std::vector parts; +}; + +class qual_identifier : public ast_node, public raw_qual_identifier +{ + public: bool is_global = false; // Start at global namespace }; @@ -148,35 +159,7 @@ enum class expr_kind intlit, nulllit, cast, -}; - -enum class exprcat -{ - unset, // unknown/unassigned type - lval, // lvalue, i.e. has memory location - rval, // rvalue, i.e. 
a pure value (not tied to any object, can be used as an operand) - error, // result of an invalid operation -}; - -struct basic_type -{ - // Type modifiers 'exttp' are stored in reverse order of declaration, for easier manipulation - // For example, `ptr unsigned char` -> basic_type { .basetp = _char, .exttp = [_unsigned, _ptr] } - dtypename basetp; // The basic type specifier - std::vector exttp; // Enums are cast to/from a byte bc why not - - basic_type() - : basetp(dtypename::_none) {} - - bool operator==(basic_type const& o) const - { - return basetp == o.basetp and exttp == o.exttp; - } - - bool operator!=(basic_type const& o) const - { - return !operator==(o); - } + alloca, }; class expr : public ast_node @@ -188,6 +171,11 @@ class expr : public ast_node basic_type type; exprcat cat; + bool has_error() const + { + return cat == exprcat::error; + } + protected: expr(expr_kind kind) : kind(kind), cat(exprcat::unset) {} @@ -234,17 +222,6 @@ class func_call : public expr : expr(expr_kind::call), callee(std::move(callee)), args(std::move(args)) {} }; -// Left-Side stuff - -// Left Side Operand type - -enum class loptype -{ - access, // Array Accessor [] - ident, - lunop, // left unary operator -}; - class id_ref_expr : public expr { public: @@ -263,15 +240,6 @@ class id_ref_expr : public expr } }; -#if 0 -class loperand : public expr -{ - public: - std::unique_ptr node; - loptype type; -}; -#endif - class string_literal : public expr { public: @@ -280,14 +248,6 @@ class string_literal : public expr string_literal() : expr(expr_kind::strlit) {} }; -struct integer_value -{ - int64_t val; - - integer_value() = default; - constexpr integer_value(int64_t v) : val(v) {} -}; - class integral_literal : public expr { public: @@ -305,12 +265,36 @@ class nullptr_literal : public expr nullptr_literal() : expr(expr_kind::nulllit) {} }; +enum class cast_kind +{ + _invalid, // Invalid cast operation + _explicit, // Cast spelled in the program, (expression) as(type) + _const, // 
Cast between const-qualified and non-const-qualified types + _grade, // Cast between types of different sizes/characteristics + _sign, // Cast between integral types of different signs (signed, unsigned, unspecified) + _cat, // Value category conversion + _float, // Floating point conversion, between floats and ints + _bool, // Boolean conversion +}; + class cast_expr : public expr { public: exprh operand; + cast_kind ckind; - cast_expr() : expr(expr_kind::cast) {} + cast_expr(cast_kind ckind) + : expr(expr_kind::cast), ckind(ckind) {} +}; + +class alloca_expr : public expr +{ + public: + basic_type alloc_type; + exprh size; + + alloca_expr(basic_type type, exprh size) + : expr(expr_kind::alloca), alloc_type(std::move(type)), size(std::move(size)) {} }; // === Declarations === @@ -332,7 +316,11 @@ class decl : public ast_node public: using node_base = decl; // For node type identification + decl(decl const&) = delete; + decl& operator=(decl const&) = delete; + decl_kind kind; + tracker_symbol* symbol = nullptr; // Assiociated symbol protected: decl(decl_kind kind) : kind(kind) {} @@ -341,15 +329,6 @@ class decl : public ast_node class ast_root : public decl { public: - #if 0 - // Vectors are in order - std::vector> imports; - std::vector> externs; - std::vector> nspaces; - std::vector> fdecls; - std::vector> fdefs; - std::unique_ptr entry; - #endif std::vector top_decls; entry_decl* entry = nullptr; @@ -386,6 +365,7 @@ class extern_decl : public decl { public: std::string real_name; + token_location name_loc; declh inner_decl; extern_decl() : decl(decl_kind::external) {} @@ -449,6 +429,9 @@ class stmt : public ast_node public: using node_base = stmt; // For node type identification + stmt(stmt const&) = delete; + stmt& operator=(stmt const&) = delete; + // There's a bunch of them so gotta make another enum smh stmt_kind kind; diff --git a/src/ast/ast_deleters.cpp b/src/ast/ast_deleters.cpp index 95148fe..d87decd 100644 --- a/src/ast/ast_deleters.cpp +++ 
b/src/ast/ast_deleters.cpp @@ -23,6 +23,7 @@ void expr_node_deleter::operator()(expr* node) const case expr_kind::cast: return delete static_cast(node); } + chirp_unreachable("Deleting unknown expr node"); } void decl_node_deleter::operator()(decl* node) const @@ -46,6 +47,7 @@ void decl_node_deleter::operator()(decl* node) const case decl_kind::external: return delete static_cast(node); } + chirp_unreachable("Deleting unknown decl node"); } void stmt_node_deleter::operator()(stmt* node) const @@ -69,4 +71,5 @@ void stmt_node_deleter::operator()(stmt* node) const case stmt_kind::null: return delete static_cast(node); } + chirp_unreachable("Deleting unknown stmt node"); } diff --git a/src/ast/ast_dumper.cpp b/src/ast/ast_dumper.cpp index 00eb27c..c5b123f 100644 --- a/src/ast/ast_dumper.cpp +++ b/src/ast/ast_dumper.cpp @@ -1,6 +1,4 @@ #include "ast_dumper.hpp" -#include "../color.hpp" -#include "ast.hpp" #include #include @@ -14,41 +12,17 @@ constexpr color c_color_top_level = color::blue | color::bright | color::bold; constexpr color c_color_top_level_unavail = color::red | color::bright | color::bold; constexpr color c_color_type = color::green; constexpr color c_color_type_cat = color::green | color::blue; +constexpr color c_color_type_error = color::red | color::bright | color::bold; constexpr color c_color_expr = color::blue | color::bright | color::bold; constexpr color c_color_decl = color::green | color::bright | color::bold; constexpr color c_color_stmt = color::red | color::blue | color::bright | color::bold; constexpr color c_color_identifier = color::blue | color::bright; constexpr color c_color_location = color::red | color::green; -static std::string indent(int x) +void text_dumper_base::indent(int depth) { - std::string result; - result.resize(x * 3, ' '); - return result; -} - -void text_ast_dumper::write_color(std::string txt, color c) { - if (has_colors) { - #ifdef __unix__ - // Doesn't care if it's on a VT100 terminal or not - // will do coloring 
anyway. - std::cout << "\033["; - unsigned int col = static_cast(c); - if ((c & color::bright) != color::blank) - std::cout << (90 + (col & 7)); - else - std::cout << (30 + (col & 7)); - if ((c & color::bold) != color::blank) - std::cout << ";1"; - std::cout << 'm'; - #endif - } - std::cout << txt; - if (has_colors) { - #ifdef __unix__ - std::cout << "\033[m"; - #endif - } + for (int i = 0; i < depth * 3; ++i) + std::cout << ' '; } void text_ast_dumper::print_location(location_range loc) { @@ -59,7 +33,7 @@ void text_ast_dumper::print_location(location_range loc) { } // I don't even care about names now -static std::string dump_dtname(dtypename n) +static char const* dump_dtname(dtypename n) { switch (n) { @@ -79,12 +53,11 @@ static std::string dump_dtname(dtypename n) return "bool"; case dtypename::_none: return "none"; - default: - return "unknown"; } + chirp_unreachable("dump_dtname"); } -static std::string dump_dtmod(dtypemod m) +static char const* dump_dtmod(dtypemod m) { switch (m) { @@ -98,12 +71,34 @@ static std::string dump_dtmod(dtypemod m) return "const"; case dtypemod::_func: return "func"; - default: - return ""; } + chirp_unreachable("dump_dtmod"); } -static std::string dump_exprcat(exprcat c) +static char const* dump_cast_kind(cast_kind k) { + switch (k) + { + case cast_kind::_invalid: + return "invalid"; + case cast_kind::_explicit: + return "explicit_cast"; + case cast_kind::_const: + return "const_cast"; + case cast_kind::_grade: + return "grade_cast"; + case cast_kind::_sign: + return "sign_cast"; + case cast_kind::_cat: + return "cat_cast"; + case cast_kind::_float: + return "float_cast"; + case cast_kind::_bool: + return "bool_cast"; + } + chirp_unreachable("dump_cast_kind"); +} + +static char const* dump_exprcat(exprcat c) { switch (c) { @@ -116,10 +111,10 @@ static std::string dump_exprcat(exprcat c) case exprcat::error: return "error"; } - __builtin_unreachable(); + chirp_unreachable("dump_exprcat"); } -std::string exprop_id(tkn_type op) 
+char const* exprop_id(tkn_type op) { switch (op) { @@ -151,6 +146,8 @@ std::string exprop_id(tkn_type op) return "ref"; case tkn_type::deref_op: return "deref"; + case tkn_type::kw_alloca: + return "alloca"; // Assignments case tkn_type::assign_op: return "="; @@ -166,6 +163,9 @@ std::string exprop_id(tkn_type op) void text_ast_dumper::dump_ast(ast_root const& root) { + location_run run; + if (loc_prov) + loc_prov->begin_run(run); write_color("Top Level:\n", c_color_top_level); #if 0 @@ -254,11 +254,13 @@ void text_ast_dumper::dump_ast(ast_root const& root) else { write_color("-- No entry --\n", c_color_top_level_unavail); } + if (loc_prov) + loc_prov->end_run(); } void text_ast_dumper::dump_identifier(identifier const& n) { - std::cout << indent(depth); + indent(depth); write_color("identifier ", c_color_identifier); print_location(n.loc); std::cout << ' ' << n.name; @@ -267,15 +269,15 @@ void text_ast_dumper::dump_identifier(identifier const& n) void text_ast_dumper::dump_qual_identifier(qual_identifier const& n) { - std::cout << indent(depth); + indent(depth); write_color("qual_identifier ", c_color_identifier); print_location(n.loc); std::cout << ' '; - int i = 0; + int i = 0, len = n.parts.size(); for (auto const& id : n.parts) { std::cout << id.name; - if (++i != n.parts.size()) + if (++i != len) { std::cout << "."; } @@ -358,13 +360,12 @@ void text_ast_dumper::dump_stmt(stmt const& node) void text_ast_dumper::dump_basic_type(basic_type const& type) { - std::cout << indent(depth); + indent(depth); std::cout << "basic_type "; // Remember that order is reversed for (auto it = type.exttp.rbegin(), end = type.exttp.rend(); it != end; ++it) { - dtypemod w = static_cast(*it); - write_color(dump_dtmod(w), c_color_type); + write_color(dump_dtmod(*it), c_color_type); std::cout << ' '; } write_color(dump_dtname(type.basetp), c_color_type); @@ -373,15 +374,19 @@ void text_ast_dumper::dump_basic_type(basic_type const& type) void 
text_ast_dumper::dump_expr_type(basic_type const& type, exprcat cat) { - std::cout << indent(depth); + indent(depth); std::cout << "expr_type "; - write_color(dump_exprcat(cat), c_color_type_cat); + if (cat == exprcat::error or cat == exprcat::unset) + begin_color(c_color_type_error); + else + begin_color(c_color_type_cat); + std::cout << dump_exprcat(cat); + end_color(); std::cout << ' '; // Remember that order is reversed for (auto it = type.exttp.rbegin(), end = type.exttp.rend(); it != end; ++it) { - dtypemod w = static_cast(*it); - write_color(dump_dtmod(w), c_color_type); + write_color(dump_dtmod(*it), c_color_type); std::cout << ' '; } write_color(dump_dtname(type.basetp), c_color_type); @@ -390,7 +395,7 @@ void text_ast_dumper::dump_expr_type(basic_type const& type, exprcat cat) void text_ast_dumper::dump_binop(binop const& n) { - std::cout << indent(depth); + indent(depth); write_color("binop ", c_color_expr); print_location(n.loc); std::cout << " ("; @@ -408,7 +413,7 @@ void text_ast_dumper::dump_binop(binop const& n) void text_ast_dumper::dump_unop(unop const& n) { - std::cout << indent(depth); + indent(depth); write_color("unop ", c_color_expr); print_location(n.loc); std::cout << " ("; @@ -425,7 +430,7 @@ void text_ast_dumper::dump_unop(unop const& n) void text_ast_dumper::dump_arguments(arguments const& n) { - std::cout << indent(depth); + indent(depth); write_color("arguments ", c_color_expr); print_location(n.loc); std::cout << '\n'; @@ -439,7 +444,7 @@ void text_ast_dumper::dump_arguments(arguments const& n) void text_ast_dumper::dump_func_call(func_call const& n) { - std::cout << indent(depth); + indent(depth); write_color("function_call ", c_color_expr); print_location(n.loc); std::cout << '\n'; @@ -453,7 +458,7 @@ void text_ast_dumper::dump_func_call(func_call const& n) void text_ast_dumper::dump_id_ref_expr(id_ref_expr const& n) { - std::cout << indent(depth); + indent(depth); write_color("id_ref_expr ", c_color_expr); print_location(n.loc); 
std::cout << ' '; @@ -477,7 +482,7 @@ void text_ast_dumper::dump_id_ref_expr(id_ref_expr const& n) /*void text_ast_dumper::dump_loperand(loperand const& n) { - std::cout << indent(depth); + indent(depth); write_color("left_operand:", c_color_expr); print_location(n.loc); std::cout << '\n'; @@ -485,7 +490,7 @@ void text_ast_dumper::dump_id_ref_expr(id_ref_expr const& n) void text_ast_dumper::dump_string_literal(string_literal const& n) { - std::cout << indent(depth); + indent(depth); write_color("string_literal ", c_color_expr); print_location(n.loc); std::cout << " \""; @@ -502,7 +507,7 @@ void text_ast_dumper::dump_string_literal(string_literal const& n) void text_ast_dumper::dump_integral_literal(integral_literal const& n) { - std::cout << indent(depth); + indent(depth); write_color("integral_literal ", c_color_expr); print_location(n.loc); std::cout << ' '; @@ -515,7 +520,7 @@ void text_ast_dumper::dump_integral_literal(integral_literal const& n) void text_ast_dumper::dump_nullptr_literal(nullptr_literal const& n) { - std::cout << indent(depth); + indent(depth); write_color("nullptr_literal ", c_color_expr); print_location(n.loc); std::cout << '\n'; @@ -529,13 +534,21 @@ void text_ast_dumper::dump_nullptr_literal(nullptr_literal const& n) void text_ast_dumper::dump_cast_expr(cast_expr const& n) { - std::cout << indent(depth); + indent(depth); write_color("cast_expr ", c_color_expr); print_location(n.loc); + std::cout << ' '; + if (n.ckind == cast_kind::_invalid) + begin_color(c_color_type_error); + else + begin_color(c_color_type_cat); + std::cout << dump_cast_kind(n.ckind); + end_color(); std::cout << '\n'; ++depth; - // Exception: always print type of a cast expression - dump_expr_type(n.type, n.cat); + // Exception: always print type of a cast expression, if it's an explicit cast + if (show_expr_types or n.ckind == cast_kind::_explicit) + dump_expr_type(n.type, n.cat); dump_expr(*n.operand); --depth; } @@ -544,7 +557,7 @@ void 
text_ast_dumper::dump_cast_expr(cast_expr const& n) void text_ast_dumper::dump_var_decl(var_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("variable_declaration ", c_color_decl); print_location(n.loc); std::cout << '\n'; @@ -558,7 +571,7 @@ void text_ast_dumper::dump_var_decl(var_decl const& n) void text_ast_dumper::dump_entry_decl(entry_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("entry_declaration ", c_color_decl); print_location(n.loc); std::cout << ' '; @@ -571,12 +584,13 @@ void text_ast_dumper::dump_entry_decl(entry_decl const& n) void text_ast_dumper::dump_import_decl(import_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("import_declaration ", c_color_decl); print_location(n.loc); std::cout << '\n'; ++depth; - std::cout << indent(depth) << "filename: \""; + indent(depth); + std::cout << "filename: \""; write_color(n.filename, c_color_identifier); std::cout << "\"\n"; --depth; @@ -584,13 +598,14 @@ void text_ast_dumper::dump_import_decl(import_decl const& n) void text_ast_dumper::dump_extern_decl(extern_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("extern ", c_color_decl); print_location(n.loc); std::cout << '\n'; ++depth; - std::cout << indent(depth) << "real name: \""; + indent(depth); + std::cout << "real name: \""; write_color(n.real_name, c_color_identifier); std::cout << "\"\n"; dump_decl(*n.inner_decl); @@ -599,7 +614,7 @@ void text_ast_dumper::dump_extern_decl(extern_decl const& n) void text_ast_dumper::dump_namespace_decl(namespace_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("namespace_declaration ", c_color_decl); print_location(n.loc); std::cout << '\n'; @@ -615,7 +630,7 @@ void text_ast_dumper::dump_namespace_decl(namespace_decl const& n) void text_ast_dumper::dump_parameters(parameters const& n) { - std::cout << indent(depth); + indent(depth); write_color("parameters ", c_color_decl); 
print_location(n.loc); std::cout << '\n'; @@ -629,7 +644,7 @@ void text_ast_dumper::dump_parameters(parameters const& n) void text_ast_dumper::dump_func_decl(func_decl const& n) { - std::cout << indent(depth); + indent(depth); write_color("function_declaration ", c_color_decl); print_location(n.loc); std::cout << '\n'; @@ -642,7 +657,7 @@ void text_ast_dumper::dump_func_decl(func_decl const& n) void text_ast_dumper::dump_func_def(func_def const& n) { - std::cout << indent(depth); + indent(depth); write_color("function_definition ", c_color_decl); print_location(n.loc); std::cout << '\n'; @@ -658,7 +673,7 @@ void text_ast_dumper::dump_func_def(func_def const& n) void text_ast_dumper::dump_decl_stmt(decl_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("declaration_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; @@ -669,7 +684,7 @@ void text_ast_dumper::dump_decl_stmt(decl_stmt const& n) void text_ast_dumper::dump_assign_stmt(assign_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("assign_statement ", c_color_stmt); print_location(n.loc); std::cout << " ("; @@ -685,7 +700,7 @@ void text_ast_dumper::dump_assign_stmt(assign_stmt const& n) void text_ast_dumper::dump_compound_stmt(compound_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("compound_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; @@ -699,7 +714,7 @@ void text_ast_dumper::dump_compound_stmt(compound_stmt const& n) void text_ast_dumper::dump_ret_stmt(ret_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("ret_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; @@ -710,7 +725,7 @@ void text_ast_dumper::dump_ret_stmt(ret_stmt const& n) void text_ast_dumper::dump_conditional_stmt(conditional_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("conditional_statement ", c_color_stmt); print_location(n.loc); std::cout << 
'\n'; @@ -724,7 +739,7 @@ void text_ast_dumper::dump_conditional_stmt(conditional_stmt const& n) void text_ast_dumper::dump_iteration_stmt(iteration_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("iteration_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; @@ -736,7 +751,7 @@ void text_ast_dumper::dump_iteration_stmt(iteration_stmt const& n) void text_ast_dumper::dump_expr_stmt(expr_stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("expression_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; @@ -747,7 +762,7 @@ void text_ast_dumper::dump_expr_stmt(expr_stmt const& n) void text_ast_dumper::dump_null_stmt(stmt const& n) { - std::cout << indent(depth); + indent(depth); write_color("null_statement ", c_color_stmt); print_location(n.loc); std::cout << '\n'; diff --git a/src/ast/ast_dumper.hpp b/src/ast/ast_dumper.hpp index 995b3a5..5132708 100644 --- a/src/ast/ast_dumper.hpp +++ b/src/ast/ast_dumper.hpp @@ -3,18 +3,16 @@ #pragma once #include "ast.hpp" +#include "../shared/text_dumper_base.hpp" -enum class color; - -class text_ast_dumper { - bool has_colors; +class text_ast_dumper : public text_dumper_base { bool show_expr_types; int depth = 0; location_provider* loc_prov; public: text_ast_dumper(bool enable_colors, bool show_expr_types, location_provider* loc_prov = nullptr) - : has_colors(enable_colors), show_expr_types(show_expr_types), loc_prov(loc_prov) + : text_dumper_base(enable_colors), show_expr_types(show_expr_types), loc_prov(loc_prov) {} void dump_ast(ast_root const& root); @@ -32,7 +30,6 @@ class text_ast_dumper { void dump_arguments(arguments const&); void dump_func_call(func_call const&); void dump_id_ref_expr(id_ref_expr const&); - void dump_loperand(loperand const&) = delete; void dump_string_literal(string_literal const&); void dump_integral_literal(integral_literal const&); void dump_nullptr_literal(nullptr_literal const&); @@ -57,6 +54,5 @@ class 
text_ast_dumper { void dump_null_stmt(stmt const&); // null_stmt contains no further members private: - void write_color(std::string, color); void print_location(location_range); }; diff --git a/src/ast/types.hpp b/src/ast/types.hpp index b6a6fcb..9fcb1c4 100644 --- a/src/ast/types.hpp +++ b/src/ast/types.hpp @@ -1,9 +1,12 @@ /* +Type system v.1 Contains an enum for each base datatypes, could be in ast.hpp, but since it's not an ast_node, I prefer having it here */ #pragma once +#include + // Almost all of the keywords here, are also keywords in C++ // so I for each of them put a _ before. // Also I'm not sure how we're gonna do classes now that all types are in this enum @@ -11,21 +14,113 @@ since it's not an ast_node, I prefer having it here enum class dtypename { + _char, + _byte, _int, _long, _float, _double, - _char, - _byte, _bool, - _none + _none, }; -enum class dtypemod +enum class dtypemod : unsigned char { _ptr, _signed, _unsigned, _const, - _func -}; \ No newline at end of file + _func, +}; + +// Primary class of a type (influenced by both the base type and modifiers) +enum class dtypeclass +{ + _none, + _int, // Integral / character + _float, + _bool, + _ptr, + _func, +}; + +enum class exprcat +{ + unset, // unknown/unassigned type + lval, // lvalue, i.e. has memory location + rval, // rvalue, i.e. 
a pure value (not tied to any object, can be used as an operand) + error, // result of an invalid operation +}; + +struct basic_type +{ + // Type modifiers 'exttp' are stored in reverse order of declaration, for easier manipulation + // For example, `ptr unsigned char` -> basic_type { .basetp = _char, .exttp = [_unsigned, _ptr] } + dtypename basetp; + std::vector<dtypemod> exttp; + // The `basetp` is considered to be the most nested specifier + // All valid types must hold the following invariants + // - `unsigned` & `signed` modifiers may only be applied directly to integral types (see dtypeclass `int`), except for `char` + // - `unsigned` & `signed` modifiers may appear at most once, and are mutually exclusive + // - consecutive `const` modifiers cannot appear in the list + // - `func` & `const` modifiers cannot be applied directly to a `func` specifier + // - `const` cannot be applied to a `none` type, unless preceded by a `ptr` + // - `none` cannot be the top specifier (least nested), unless as return type of a function + + basic_type(dtypename t = dtypename::_none) + : basetp(t) {} + + dtypeclass to_class() const + { + for (auto m : exttp) + { + if (m == dtypemod::_ptr) + return dtypeclass::_ptr; + if (m == dtypemod::_func) + return dtypeclass::_func; + } + switch (basetp) + { + case dtypename::_int: + case dtypename::_long: + case dtypename::_char: + case dtypename::_byte: + return dtypeclass::_int; + case dtypename::_float: + case dtypename::_double: + return dtypeclass::_float; + case dtypename::_bool: + return dtypeclass::_bool; + case dtypename::_none: + return dtypeclass::_none; + } + } + + bool has_modifier_front(dtypemod mod) const + { + return !exttp.empty() and exttp.front() == mod; + } + + bool has_modifier_back(dtypemod mod) const + { + return !exttp.empty() and exttp.back() == mod; + } + + bool operator==(basic_type const& o) const + { + return basetp == o.basetp and exttp == o.exttp; + } + + bool operator!=(basic_type const& o) const + { + return
!operator==(o); + } +}; + +struct integer_value +{ + int64_t val; + + integer_value() = default; + constexpr integer_value(int64_t v) : val(v) {} +}; diff --git a/src/codegen/codegen.hpp b/src/codegen/codegen.hpp index 24746f9..699cc05 100644 --- a/src/codegen/codegen.hpp +++ b/src/codegen/codegen.hpp @@ -23,11 +23,13 @@ class codegen diagnostic_manager& diagnostics; std::string emit_identifier(identifier const&); - std::string emit_qual_identifier(qual_identifier const&); + std::string emit_raw_qual_identifier(raw_qual_identifier const&); + std::string emit_decl_symbol_name(decl const*); std::string emit_datatype(basic_type const&); std::string emit_expr(expr const&); std::string emit_binop(binop const&); std::string emit_unop(unop const&); + std::string emit_alloca_expr(alloca_expr const&); std::string emit_arguments(arguments const&); std::string emit_func_call(func_call const&); std::string emit_id_ref_expr(id_ref_expr const&); diff --git a/src/codegen/funcgen.cpp b/src/codegen/funcgen.cpp index 5d7a8d0..9e14a82 100644 --- a/src/codegen/funcgen.cpp +++ b/src/codegen/funcgen.cpp @@ -54,7 +54,7 @@ std::string codegen::emit_func_decl(func_decl const& node) std::string result; result += emit_datatype(node.result_type); result += ' '; - result += emit_identifier(node.ident); + result += emit_decl_symbol_name(&node); result += emit_parameters(node.params); result += ";\n"; return result; @@ -65,7 +65,7 @@ std::string codegen::emit_func_def(func_def const& node) std::string result; result += emit_datatype(node.result_type); result += ' '; - result += emit_identifier(node.ident); + result += emit_decl_symbol_name(&node); result += emit_parameters(node.params); result += '\n'; result += emit_compound_stmt(*node.body); diff --git a/src/codegen/stmt.cpp b/src/codegen/stmt.cpp index b85ccb7..de92378 100644 --- a/src/codegen/stmt.cpp +++ b/src/codegen/stmt.cpp @@ -33,7 +33,7 @@ std::string codegen::emit_var_decl(var_decl const& node) std::string result; result += 
emit_datatype(node.type); result += ' '; - result += emit_identifier(node.ident); + result += emit_decl_symbol_name(&node); if (node.init) { result += " = "; diff --git a/src/codegen/valgen.cpp b/src/codegen/valgen.cpp index d277d14..897163c 100644 --- a/src/codegen/valgen.cpp +++ b/src/codegen/valgen.cpp @@ -3,12 +3,17 @@ #include "../ast/types.hpp" #include -std::string codegen::emit_qual_identifier(qual_identifier const& ident) +std::string codegen::emit_raw_qual_identifier(raw_qual_identifier const& ident) { - std::string result; + // Call this only on fully expanded identifiers + #ifndef NDEBUG + if (ident.parts.empty()) + { + return "/* OH NO: Identifier not expanded */"; + } + #endif - // This is even more hacky beyond any belief (pt. 2) - // Should probably (definitely) normalize access path first + std::string result; for (auto id = ident.parts.cbegin(), end = ident.parts.cend() - 1; id != end; ++id) { result += emit_identifier(*id) + "$"; @@ -22,9 +27,16 @@ std::string codegen::emit_identifier(identifier const& ident) return ident.name; } +std::string codegen::emit_decl_symbol_name(decl const* node) +{ + if (node and node->symbol) + return emit_raw_qual_identifier(node->symbol->full_name); + return "/*! 
spotted unexpanded identifier */"; +} + std::string codegen::emit_id_ref_expr(id_ref_expr const& node) { - return emit_qual_identifier(node.ident); + return emit_decl_symbol_name(node.target); } // This is not finished @@ -63,7 +75,7 @@ std::string codegen::emit_datatype(basic_type const& type) for (auto d : type.exttp) { - switch (static_cast(d)) + switch (d) { case dtypemod::_ptr: result += "*"; @@ -96,7 +108,8 @@ std::string codegen::emit_string_literal(string_literal const& node) std::string codegen::emit_integral_literal(integral_literal const& node) { - return "(" + std::to_string(node.value.val) + ")"; + // Negative values shouldn't cause much problems, so I'm not adding parens + return std::to_string(node.value.val); } std::string codegen::emit_nullptr_literal(nullptr_literal const& node) @@ -139,7 +152,23 @@ std::string codegen::emit_unop(unop const& node) result += "("; result += emit_expr(*node.operand); - result += ") "; + result += ")"; + + return result; +} + +std::string codegen::emit_alloca_expr(alloca_expr const& node) +{ + std::string result; + + // It just works, source: trust me + result += "("; + result += emit_datatype(node.type); + result += ")__builtin_alloca(("; + result += emit_expr(*node.size); + result += ")*sizeof("; + result += emit_datatype(node.alloc_type); + result += "))"; return result; } @@ -175,6 +204,8 @@ std::string codegen::emit_expr(expr const& node) return emit_nullptr_literal(static_cast(node)); case expr_kind::cast: return emit_cast_expr(static_cast(node)); + case expr_kind::alloca: + return emit_alloca_expr(static_cast(node)); } #ifndef NDEBUG return "\n#error Bad expression, this is a bug\n"; diff --git a/src/color.cpp b/src/color.cpp deleted file mode 100644 index bebfee3..0000000 --- a/src/color.cpp +++ /dev/null @@ -1,23 +0,0 @@ -#include "color.hpp" - -std::string apply_color(std::string txt, color c) { - #ifdef __unix__ - std::string result; - // Doesn't care if it's on a VT100 terminal or not - // will do coloring 
anyway. - result += "\033["; - unsigned int col = static_cast(c); - if ((c & color::bright) != color::blank) - result += std::to_string(90 + (col & 7)); - else - result += std::to_string(30 + (col & 7)); - if ((c & color::bold) != color::blank) - result += ";1"; - result += 'm'; - result += txt; - result += "\033[m"; - return result; - #else - return txt; - #endif -} diff --git a/src/color.hpp b/src/color.hpp deleted file mode 100644 index 0207e4b..0000000 --- a/src/color.hpp +++ /dev/null @@ -1,29 +0,0 @@ -/* -Command-Line coloring, currently works only on linux VT100-based terminals.. -This should be overhaul, rn this is very hacky. -*/ -#pragma once - -#include - -enum class color -{ - blank = 0, - red = 1, - green = 2, - blue = 4, - bright = 8, - bold = 0x10, -}; - -constexpr color operator|(color a, color b) -{ - return static_cast(static_cast(a) | static_cast(b)); -} - -constexpr color operator&(color a, color b) -{ - return static_cast(static_cast(a) & static_cast(b)); -} - -std::string apply_color(std::string txt, color c); diff --git a/src/cmd.cpp b/src/frontend/cmd.cpp similarity index 76% rename from src/cmd.cpp rename to src/frontend/cmd.cpp index eaee1b8..129c87d 100644 --- a/src/cmd.cpp +++ b/src/frontend/cmd.cpp @@ -16,10 +16,14 @@ void cmd::write_help() "--- Utility Options ---\n" "\t-dump-tokens\tDumps the lexer tokens of the source file\n" "\t-dump-ast\tDumps the AST in a human readable view\n" + "\t-dump-syms\tDumps the symbol table of the program\n" + "\t-dump-syms-all\tDumps the symbol table of the program, including local and unnamed symbols\n" "\t-keep-tmp\tKeeps the temporary folder, instead of deleting it after compiling\n" + "\t-no-out-gen\tDon't emit an output file, only check program for correctness\n" "\t-show-unresolved-refs\tShow warnings whether an undefined symbol is referenced" "(no longer needed since semantic analysis already reports errors; currently no-op)\n" "\t-show-expr-types\tShow types in expressions (effective during an 
AST dump)\n" + "\t-soft-type-checks\tDon't generate errors on type mismatches\n" ; } @@ -69,10 +73,24 @@ cmd parse_cmd(int argc, char *argv[]) { c.dump_ast = true; } + else if (std::strcmp(argv[i], "-dump-syms") == 0) + { + c.dump_syms = true; + c.dump_syms_extra = false; + } + else if (std::strcmp(argv[i], "-dump-syms-all") == 0) + { + c.dump_syms = true; + c.dump_syms_extra = true; + } else if (std::strcmp(argv[i], "-keep-tmp") == 0) { c.keep_tmp = true; } + else if (std::strcmp(argv[i], "-no-out-gen") == 0) + { + c.no_outgen = true; + } else if (std::strcmp(argv[i], "-show-unresolved-refs") == 0) { c.ignore_unresolved_refs = false; @@ -81,6 +99,10 @@ cmd parse_cmd(int argc, char *argv[]) { c.show_expr_types = true; } + else if (std::strcmp(argv[i], "-soft-type-checks") == 0) + { + c.soft_type_checks = true; + } else { if (c.filename.empty()) diff --git a/src/cmd.hpp b/src/frontend/cmd.hpp similarity index 82% rename from src/cmd.hpp rename to src/frontend/cmd.hpp index 9774a4c..b8344de 100644 --- a/src/cmd.hpp +++ b/src/frontend/cmd.hpp @@ -18,10 +18,14 @@ class cmd // Options bool dump_tkns = false; bool dump_ast = false; + bool dump_syms = false; + bool dump_syms_extra = false; bool keep_tmp = false; + bool no_outgen = false; bool has_color = true; bool ignore_unresolved_refs = true; bool show_expr_types = false; + bool soft_type_checks = false; std::string filename; void write_help(); diff --git a/src/frontend/frontend.cpp b/src/frontend/frontend.cpp index 4a4d5c8..40c2502 100644 --- a/src/frontend/frontend.cpp +++ b/src/frontend/frontend.cpp @@ -1,6 +1,7 @@ #include "frontend.hpp" #include "fs.hpp" +#include "os.hpp" void frontend::make_tmp_folder() { @@ -12,26 +13,44 @@ bool frontend::find_compiler() this->has_gcc = false; this->has_clang = false; - #if defined(__unix__) || defined(__APPLE_CC__) - // This is utterly terrible - if (system("which gcc > /dev/null 2>&1") == 0) + #if defined(CHIRP_PLATFORM_UNIX) + // This is slightly less utterly terrible + char 
const* cmd[] = {"which", /* prog */nullptr, nullptr}; + file_open_descriptor descs[] = { + {0, OS_FILE_DEVNULL}, + {1, OS_FILE_DEVNULL}, + {2, OS_FILE_DEVNULL}, + {-1, -1} + }; + cmd[1] = "gcc"; + if (proc_exec(cmd, descs) == 0) { this->has_gcc = true; return true; } - else if (system("which clang > /dev/null 2>&1") == 0) + cmd[1] = "clang"; + if (proc_exec(cmd, descs) == 0) { this->has_clang = true; return true; } - #elif defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) - // This is utterly terrible - if (system("where.exe gcc > nul") == 0) + #elif defined(CHIRP_PLATFORM_WINNT) + // This is slightly less utterly terrible + char const* cmd[] = {"where", /* prog */nullptr, nullptr}; + file_open_descriptor descs[] = { + {0, OS_FILE_DEVNULL}, + {1, OS_FILE_DEVNULL}, + {2, OS_FILE_DEVNULL}, + {-1, -1} + }; + cmd[1] = "gcc"; + if (proc_exec(cmd, descs) == 0) { this->has_gcc = true; return true; } - else if (system("where.exe clang > nul") == 0) + cmd[1] = "clang"; + if (proc_exec(cmd, descs) == 0) { this->has_clang = true; return true; diff --git a/src/main.cpp b/src/frontend/main.cpp similarity index 60% rename from src/main.cpp rename to src/frontend/main.cpp index b56cd41..8e2b836 100644 --- a/src/main.cpp +++ b/src/frontend/main.cpp @@ -1,12 +1,12 @@ // This manages to pretty much interface all components with eachothers #include "cmd.hpp" -#include "color.hpp" -#include "lexer/lexer.hpp" -#include "parser/parser.hpp" -#include "codegen/codegen.hpp" -#include "frontend/frontend.hpp" -#include "ast/ast_dumper.hpp" -#include "seman/analyser.hpp" +#include "../lexer/lexer.hpp" +#include "../parser/parser.hpp" +#include "../codegen/codegen.hpp" +#include "../frontend/frontend.hpp" +#include "../ast/ast_dumper.hpp" +#include "../seman/analyser.hpp" +#include "../seman/sym_dumper.hpp" #include #include #include @@ -34,7 +34,7 @@ int main(int argc, char** argv) if (options.error) { - std::cout << "Error in provided arguments\n"; + std::cerr << 
"Error in provided arguments\n"; return -1; } @@ -44,7 +44,7 @@ int main(int argc, char** argv) if (!f) { - std::cout << "Can't open file: \"" << options.filename << "\"\n"; + std::cerr << "Can't open file: \"" << options.filename << "\"\n"; return -1; } @@ -72,9 +72,10 @@ int main(int argc, char** argv) { std::cout << "Tokens:\n"; + location_run run; for (token &t : tkns) { - std::cout << t.util_dump() << '\n'; + std::cout << t.util_dump(&run) << '\n'; } } // Parsing @@ -85,13 +86,14 @@ int main(int argc, char** argv) // Semantic analysis analyser seman(p.get_ast(), diagnostics); + seman.soft_type_checks = options.soft_type_checks; seman.analyse(); if (options.dump_ast) { if (options.dump_tkns) { - std::cout << "--------------------" << '\n'; + std::cout << "--------------------\n"; } text_ast_dumper dumper(options.has_color, options.show_expr_types, &p); @@ -100,37 +102,28 @@ int main(int argc, char** argv) std::cout << '\n'; } - if (diagnostics.error) - { - std::cerr << "Compilation aborted\n"; - return -1; - } - - frontend frontend; - - if (!frontend.find_compiler()) + if (options.dump_syms) { - if (options.has_color) - { - std::cerr << apply_color("[TOOL MISSING] ", color::red); - } - else + if (options.dump_tkns or options.dump_ast) { - std::cerr << "[TOOL MISSING] "; + std::cout << "--------------------\n"; } - std::cerr << "Couldn't find supported C compiler on this machine.\n"; - std::cerr << "Supported compilers are clang and gcc\n"; - std::cerr << "To specify C compiler use option -compiler-path, and then the path to the compiler.\n"; + text_symbol_dumper dumper(options.has_color, options.dump_syms_extra, seman.get_tracker()); + dumper.dump_symbols(); + std::cout << '\n'; + } + + if (diagnostics.error) + { + std::cerr << "Compilation aborted\n"; return -1; } // Code Generation codegen generator(diagnostics); - auto t = std::make_unique(diagnostics); - generator.set_tree(&p.get_ast(), options.filename); generator.gen(); @@ -140,17 +133,34 @@ int main(int 
argc, char** argv) return -1; } - frontend.make_tmp_folder(); + // Outputting generated content + frontend frontend; - frontend.write_out("dump", generator.get_result()); + if (!options.no_outgen) + { + if (!frontend.find_compiler()) + { + print_color("[TOOL MISSING] ", options.has_color, std::cerr, color::red); - // Tooling - // (use the compiler) + std::cerr << "Couldn't find supported C compiler on this machine.\n"; + std::cerr << "Supported compilers are clang and gcc\n"; + std::cerr << "To specify C compiler use option -compiler-path, and then the path to the compiler.\n"; - // Cleanup - if (!options.keep_tmp) - { - frontend.remove_tmp_folder(); + return -1; + } + + frontend.make_tmp_folder(); + + frontend.write_out("dump", generator.get_result()); + + // Tooling + // (use the compiler) + + // Cleanup + if (!options.keep_tmp) + { + frontend.remove_tmp_folder(); + } } return 0; diff --git a/src/frontend/os.cpp b/src/frontend/os.cpp new file mode 100644 index 0000000..74abe3f --- /dev/null +++ b/src/frontend/os.cpp @@ -0,0 +1,190 @@ +#include "os.hpp" + +#include +#include + +#if defined(CHIRP_PLATFORM_UNIX) +#include +#include +#include +#include + +int proc_exec(char const* const argv[], file_open_descriptor const descs[], bool use_path) +{ + pid_t child_pid = fork(); + if (child_pid == -1) + { + std::perror("Warning: fork failed"); + return -1; + } + else if (child_pid == 0) + { + std::vector descriptors; + int nullfd = -1, maxfd = 0, max_used_fd; + for (auto* descriptor = descs; descriptor->target_fd != -1; ++descriptor) + { + if (descriptor->target_fd >= 0) + maxfd = std::max(maxfd, descriptor->target_fd); + } + max_used_fd = maxfd; + for (auto* descriptor = descs; descriptor->target_fd != -1; ++descriptor) + { + if (descriptor->target_fd >= descriptors.size()) + descriptors.resize(descriptor->target_fd + 1, OS_FILE_CLOSED); + int fd = descriptor->source_fd; + if (fd == OS_FILE_DEVNULL) + { + if (nullfd == -1) + { + nullfd = open("/dev/null", O_RDWR); + if 
(nullfd != -1) + { + dup2(nullfd, ++maxfd); + nullfd = maxfd; + } + } + fd = nullfd; + } + descriptors[descriptor->target_fd] = fd; + } + // TODO: Logic for keeping track of fd clobbers is too complicated for now + int fd_dst = 0; + for (int& fd_src : descriptors) + { + if (fd_src == OS_FILE_CLOSED) + close(fd_dst); + else + dup2(fd_src, fd_dst); + ++fd_dst; + } + // closefrom(n) closes every fd >= n, so start one past the highest target fd to keep it open + closefrom(max_used_fd + 1); + if (use_path) + execvp(argv[0], const_cast<char* const*>(argv)); + else + execv(argv[0], const_cast<char* const*>(argv)); + chirp_unreachable("Process not executed"); + } + + int status; + if (waitpid(child_pid, &status, 0) == -1) + { + std::perror("Warning: wait failed"); + return -2; + } + return WEXITSTATUS(status); +} +#elif defined(CHIRP_PLATFORM_WINNT) +#include + +#include +#include + +// Adapted from https://stackoverflow.com/questions/2611044/process-start-pass-html-code-to-exe-as-argument/2611075#2611075 +static void EscapeBackslashes(std::string& sb, char const* s, char const* begin) +{ + // Backslashes must be escaped if and only if they precede a double quote. + while (*s == '\\') + { + sb += '\\'; + --s; + + if (s == begin) + break; + } +} + +static std::string ArgvToCommandLine(char const* const args[]) +{ + std::string sb; + // Walk the NULL-terminated argument array; `s` is re-read for every argument + for (; *args != nullptr; ++args) + { + auto* s = *args; + auto* const sbeg = s; + size_t s_len = std::strlen(s); + auto* const send = s + s_len; + sb += '"'; + // Escape double quotes (") and backslashes (\). + while (true) + { + // Put this test first to support zero length strings.
+ if (s >= send) + break; + + auto* quote = std::strchr(s, '"'); + if (quote == nullptr) + break; + + sb.append(s, quote); + EscapeBackslashes(sb, quote - 1, s); + sb += "\\\""; + s = quote + 1; + } + sb.append(s, send); + EscapeBackslashes(sb, send - 1, sbeg); + sb += "\" "; + } + return sb; +} + +int proc_exec(char const* const argv[], file_open_descriptor const descs[], bool use_path) +{ + HANDLE handles[3] {}; + for (auto* descriptor = descs; descriptor->target_fd != -1; ++descriptor) + { + if (descriptor->target_fd > 2) + { + std::fputs("Warning: Windows process exec target fd > 2\n", stderr); + return -1; + } + + HANDLE handle = nullptr; + switch (descriptor->source_fd) + { + case OS_FILE_CLOSED: + case OS_FILE_DEVNULL: + break; + + case 0: + handle = GetStdHandle(STD_INPUT_HANDLE); + break; + case 1: + handle = GetStdHandle(STD_OUTPUT_HANDLE); + break; + case 2: + handle = GetStdHandle(STD_ERROR_HANDLE); + break; + default: + std::fputs("Warning: Windows process exec source fd > 2\n", stderr); + return -1; + } + handles[descriptor->target_fd] = handle; + } + std::string cmd_line = ArgvToCommandLine(argv); + STARTUPINFOA startup_info{}; + startup_info.cb = sizeof startup_info; + startup_info.dwFlags = STARTF_USESTDHANDLES; + startup_info.hStdInput = handles[0]; + startup_info.hStdOutput = handles[1]; + startup_info.hStdError = handles[2]; + + PROCESS_INFORMATION proc_info{}; + + // lpCommandLine is a non-const LPSTR, hence data() rather than c_str() + if (!CreateProcessA(argv[0], cmd_line.data(), nullptr, nullptr, false, 0, nullptr, nullptr, &startup_info, &proc_info)) + { + return -1; + } + + // Wait until child process exits. + WaitForSingleObject(proc_info.hProcess, INFINITE); + + // GetExitCodeProcess writes through an LPDWORD + DWORD exit_code = static_cast<DWORD>(-2); + GetExitCodeProcess(proc_info.hProcess, &exit_code); + + // Close process and thread handles.
+ CloseHandle(proc_info.hProcess); + CloseHandle(proc_info.hThread); + + return exit_code; +} +#else +#error Unknown platform +#endif diff --git a/src/frontend/os.hpp b/src/frontend/os.hpp new file mode 100644 index 0000000..557e470 --- /dev/null +++ b/src/frontend/os.hpp @@ -0,0 +1,22 @@ +/// \file OS interface + +#pragma once + +#include "../shared/system.hpp" + +#define OS_FILE_CLOSED -1 +#define OS_FILE_DEVNULL -2 + +struct file_open_descriptor +{ + int target_fd; + int source_fd; // can be OS_FILE_... constant +}; + +/** + * @param argv must be a NULL-terminated array of strings + * @param descs must be an array of descriptors terminated with a descriptor whose target_fd is -1 + * @param use_path whether to use PATH to search for the program (if true), or to use the first argument as a file path directly (if false) + * @return process exit code + */ +int proc_exec(char const* const argv[], file_open_descriptor const descs[], bool use_path = true); diff --git a/src/lexer/lexer.cpp b/src/lexer/lexer.cpp index 9545819..ea5d761 100644 --- a/src/lexer/lexer.cpp +++ b/src/lexer/lexer.cpp @@ -1,5 +1,7 @@ #include "lexer.hpp" +#include "../shared/system.hpp" + #include #include @@ -143,18 +145,18 @@ std::vector lexer::preprocess(std::vector const& raw_tokens) platform target_platform; // Note: Linux should always be before the Unix macro -#if defined(_WIN64) || defined(_WIN32) || defined(__WINDOWS__) +#if CHIRP_SUBPLATFORM == CHIRP_PLATFORMID_WINNT target_platform = platform::WINDOWS; -#elif defined(__linux) || defined(linux) || defined(__linux__) +#elif CHIRP_SUBPLATFORM == CHIRP_PLATFORMID_LINUX target_platform = platform::LINUX; -#elif defined(__DragonFly__) || defined(__FreeBSD) +#elif CHIRP_SUBPLATFORM == CHIRP_PLATFORMID_BSD target_platform = platform::BSD; -#elif defined(__APPLE__) || defined(macintosh) || defined(__MACH__) +#elif CHIRP_SUBPLATFORM == CHIRP_PLATFORMID_APPLE target_platform = platform::OSX; -#elif defined(__unix) || defined(unix) +#elif 
CHIRP_SUBPLATFORM == CHIRP_PLATFORMID_UNIX target_platform = platform::UNIX; #else - target_platform = platform::UNKOWN; + target_platform = platform::UNKNOWN; #endif std::vector result; @@ -382,6 +384,7 @@ std::vector lexer::lex(std::vector const& src) MATCH_KW(true) MATCH_KW(false) MATCH_KW(null) + MATCH_KW(alloca) #undef MATCH_KW // Types #define MATCH_DT(v) MATCH(#v, tkn_type::dt_##v) @@ -402,7 +405,8 @@ std::vector lexer::lex(std::vector const& src) #undef MATCH_DM #undef MATCH // Symbols - /*else*/ if (t.loc.len == 1) + /* else */ + if (t.loc.len == 1) { switch (t.value.at(0)) { @@ -414,6 +418,8 @@ std::vector lexer::lex(std::vector const& src) CASE(':', tkn_type::colon) CASE(',', tkn_type::comma) CASE('=', tkn_type::assign_op) + CASE('<', tkn_type::lt_op) + CASE('>', tkn_type::gt_op) CASE('+', tkn_type::plus_op) CASE('-', tkn_type::minus_op) CASE('*', tkn_type::star_op) diff --git a/src/lexer/token.cpp b/src/lexer/token.cpp index 4c43ef6..70b9688 100644 --- a/src/lexer/token.cpp +++ b/src/lexer/token.cpp @@ -25,6 +25,7 @@ char const* token_names[] = { tknstr(kw_true ) tknstr(kw_false ) tknstr(kw_null ) + tknstr(kw_alloca ) tknstr(dt_int ) tknstr(dt_char ) tknstr(dt_float ) @@ -73,24 +74,15 @@ char const* token_names[] = { static_assert(std::size(token_names) == static_cast(tkn_type::eof) + 1, "Size of token names array doesn't match number of tokens"); -std::string token::util_dump(){ +std::string token::util_dump(location_run* run) +{ std::string result; result += token_names[static_cast(this->type)]; result += " '"; result += this->value; - result += "' <"; - result += loc.filename; - result += ":"; - if (loc.line == -1) - result += "invalid"; - else - result += std::to_string(loc.line+1); - result += ":"; - if (loc.start == -1) - result += "invalid"; - else - result += std::to_string(loc.start+1); + result += "' <"; + print_loc_single(loc, result, run); result += ">"; return result; } diff --git a/src/lexer/token.hpp b/src/lexer/token.hpp index 
bbb9879..de97f0c 100644 --- a/src/lexer/token.hpp +++ b/src/lexer/token.hpp @@ -18,6 +18,8 @@ enum class tkn_type kw_true, // true kw_false, // false kw_null, // null + // {:>> + kw_alloca, // alloca // Tokens with multiple keywords // Types @@ -47,11 +49,11 @@ enum class tkn_type compassign_op, // #= where the next token corresponds to the operation lt_op, gt_op, lteq_op, gteq_op, eqeq_op, noteq_op, // > < <= >= == != plus_op, minus_op, star_op, slash_op, perc_op, // + - * / % - as_op, // as cmp_S = lt_op, - cmd_E = noteq_op, + cmp_E = noteq_op, binop_S = lt_op, - binop_E = as_op, + binop_E = perc_op, + as_op, // as ref_op, deref_op, // ref deref (unary) // + & - can be unary lparen, rparen, // ( ) @@ -78,8 +80,8 @@ class token location loc; // Utility Function - std::string util_dump(); + std::string util_dump(location_run* run); }; -std::string exprop_id(tkn_type op); +char const* exprop_id(tkn_type op); extern char const* token_names[]; diff --git a/src/parser/expr.cpp b/src/parser/expr.cpp index 81dc6c0..4b2dbc8 100644 --- a/src/parser/expr.cpp +++ b/src/parser/expr.cpp @@ -34,6 +34,7 @@ static int get_operator_precedence(tkn_type op) case tkn_type::deref_op: return static_cast(precedence_class::ref); case tkn_type::as_op: + case tkn_type::kw_alloca: return static_cast(precedence_class::as); case tkn_type::star_op: case tkn_type::slash_op: @@ -87,7 +88,8 @@ exprh parser::parse_subexpr_op(exprh lhs, int max_prec) basic_type newtp = parse_datatype(); if (has_paren) expect(tkn_type::rparen); - auto ecast = new_node(); + auto ecast = new_node(cast_kind::_explicit); + ecast->loc = location_range(lhs->loc.begin, loc_peekb()); ecast->operand = std::move(lhs); ecast->type = std::move(newtp); ecast->cat = exprcat::rval; @@ -117,6 +119,21 @@ exprh parser::parse_unary_expr() // Integral promotion (for now, no-op) skip(); return parse_unary_expr(); + case tkn_type::kw_alloca: // Huehuehue + { + auto lop = loc_peek(); + basic_type type = dtypename::_byte; + skip(); + if 
(match(tkn_type::lparen)) { + type = parse_datatype(); + expect(tkn_type::rparen); + } + exprh size = parse_unary_expr(); + auto lend = size->loc.end; + auto node = new_node(std::move(type), std::move(size)); + node->loc = location_range(lop, lend); + return node; + } case tkn_type::minus_op: // Negation case tkn_type::ref_op: // Address-of case tkn_type::deref_op: // Dereference diff --git a/src/parser/parser.cpp b/src/parser/parser.cpp index 2ebb037..38000dc 100644 --- a/src/parser/parser.cpp +++ b/src/parser/parser.cpp @@ -62,6 +62,11 @@ void parser::parse_top_level() } default: { + if (is_type()) + { + this->tree.top_decls.push_back(parse_var_decl()); + break; + } this->ok = false; diagnostic(diagnostic_type::location_err) .at(loc_peek()) diff --git a/src/parser/syntax.cpp b/src/parser/syntax.cpp index 32c14f5..09a6a52 100644 --- a/src/parser/syntax.cpp +++ b/src/parser/syntax.cpp @@ -57,7 +57,7 @@ nullptr_literal parser::build_null_ptr_lit(token_location loc) { nullptr_literal node; node.type.basetp = dtypename::_none; - node.type.exttp.push_back(static_cast(dtypemod::_ptr)); + node.type.exttp.push_back(dtypemod::_ptr); node.cat = exprcat::rval; node.loc = loc; return node; @@ -193,6 +193,11 @@ nodeh parser::parse_namespace() } default: { + if (is_type()) + { + node->decls.push_back(parse_var_decl()); + break; + } this->ok = false; diagnostic(diagnostic_type::location_err) .at(loc_peek()) @@ -220,6 +225,7 @@ nodeh parser::parse_extern() auto node = new_node(); node->loc = loc_peekb(); expect(tkn_type::literal); + node->name_loc = loc_peekb(); node->real_name = peekb().value; node->real_name.erase(0, 1); node->real_name.pop_back(); diff --git a/src/parser/var.cpp b/src/parser/var.cpp index 61ce988..f43e00f 100644 --- a/src/parser/var.cpp +++ b/src/parser/var.cpp @@ -24,11 +24,7 @@ static dtypename get_dtypename(tkn_type tok) default: ; } // Unknown type token - #ifndef NDEBUG - std::abort(); - #else - __builtin_unreachable(); - #endif + 
chirp_unreachable("get_dtypename"); } static dtypemod get_dtypemod(tkn_type tok) @@ -47,11 +43,7 @@ static dtypemod get_dtypemod(tkn_type tok) default: ; } // Unknown type token - #ifndef NDEBUG - std::abort(); - #else - __builtin_unreachable(); - #endif + chirp_unreachable("get_dtypemod"); } bool parser::is_datatype() @@ -81,9 +73,9 @@ basic_type parser::parse_datatype() // Mods before the typename while (is_datamod()) { - type.exttp.push_back(static_cast(get_dtypemod(peek().type))); + type.exttp.push_back(get_dtypemod(peek().type)); - if (static_cast(get_dtypemod(peek().type)) == dtypemod::_ptr) + if (get_dtypemod(peek().type) == dtypemod::_ptr) has_candidate = true; skip(); } diff --git a/src/seman/analyser.cpp b/src/seman/analyser.cpp index a0a03f5..93d42a8 100644 --- a/src/seman/analyser.cpp +++ b/src/seman/analyser.cpp @@ -18,11 +18,9 @@ void analyser::analyse() for (auto& d : root.fdefs) visit_func_def(*d); #endif - // Top scope - sym_tracker.push_scope(); + // Top scope (already pushed) for (auto& d : root.top_decls) visit_decl(*d); - sym_tracker.pop_scope(); } void analyser::visit_expr(expr& node) @@ -45,6 +43,8 @@ void analyser::visit_expr(expr& node) return visit_nullptr_literal(static_cast(node)); case expr_kind::cast: return visit_cast_expr(static_cast(node)); + case expr_kind::alloca: + return visit_alloca_expr(static_cast(node)); } } @@ -102,33 +102,145 @@ void analyser::visit_binop(binop& node) visit_expr(*node.left); visit_expr(*node.right); - if (node.left->cat == exprcat::error or node.right->cat == exprcat::error) + if (node.left->has_error() or node.right->has_error()) { node.cat = exprcat::error; + return; } - else if (node.left->type != node.right->type) + + node.left = convert_to_rvalue(std::move(node.left)); + node.right = convert_to_rvalue(std::move(node.right)); + if (node.left->cat == exprcat::error or node.right->cat == exprcat::error) { node.cat = exprcat::error; - - diagnostic(diagnostic_type::location_err) - .at(node.op_loc) - 
.reason("Operand types don't match") - .report(diagnostics); + return; } - else + + if (node.op >= tkn_type::cmp_S and node.op <= tkn_type::cmp_E) { - // TODO: Do deeper type analysis - node.cat = exprcat::rval; - if (node.op >= tkn_type::cmp_S - and node.op <= tkn_type::cmd_E) + if (node.left->type != node.right->type) { - node.type.basetp = dtypename::_bool; + auto& t_lhs = node.left->type; + auto& t_rhs = node.right->type; + dtypeclass c_lhs = t_lhs.to_class(), c_rhs = t_rhs.to_class(); + if (c_lhs != c_rhs) + goto _bad_types; + else if (c_lhs == dtypeclass::_int) + { + basic_type common_type(dtypename::_int); + if (t_lhs.basetp == dtypename::_long or t_rhs.basetp == dtypename::_long) + common_type.basetp = dtypename::_long; + if (t_lhs.has_modifier_front(dtypemod::_unsigned) or t_rhs.has_modifier_front(dtypemod::_unsigned)) + common_type.exttp.push_back(dtypemod::_unsigned); + else if (t_lhs.has_modifier_front(dtypemod::_signed) or t_rhs.has_modifier_front(dtypemod::_signed)) + common_type.exttp.push_back(dtypemod::_signed); + node.left = perform_implicit_conversions( + std::move(node.left), common_type, exprcat::rval); + node.right = perform_implicit_conversions( + std::move(node.right), common_type, exprcat::rval); + if (node.left->has_error() or node.right->has_error()) + { + node.cat = exprcat::error; + return; + } + } + else if (c_lhs == dtypeclass::_float) + { + if (t_lhs.basetp != dtypename::_double and t_rhs.basetp != dtypename::_double) + chirp_unreachable("Something's wrong, types don't match and neither value is a double"); + basic_type common_type(dtypename::_double); + node.left = perform_implicit_conversions( + std::move(node.left), common_type, exprcat::rval); + node.right = perform_implicit_conversions( + std::move(node.right), common_type, exprcat::rval); + if (node.left->has_error() or node.right->has_error()) + { + node.cat = exprcat::error; + return; + } + } + else if (c_lhs == dtypeclass::_ptr) + { + // For now, let them be + 
diagnostic(diagnostic_type::location_warning) + .at(node.loc) + .reason("Pointer operands have different types") + .report(diagnostics); + } } - else + node.cat = exprcat::rval; + node.type.basetp = dtypename::_bool; + } + else + { + // TODO: Support pointer arithmetic + if (node.left->type != node.right->type) { - node.type = node.left->type; + auto& t_lhs = node.left->type; + auto& t_rhs = node.right->type; + dtypeclass c_lhs = t_lhs.to_class(), c_rhs = t_rhs.to_class(); + // Promote to int + if (c_lhs == dtypeclass::_bool) + c_lhs = dtypeclass::_int; + if (c_rhs == dtypeclass::_bool) + c_rhs = dtypeclass::_int; + if (c_lhs != c_rhs) + goto _bad_types; + else if (c_lhs == dtypeclass::_int) + { + basic_type common_type(dtypename::_int); + if (t_lhs.basetp == dtypename::_long or t_rhs.basetp == dtypename::_long) + common_type.basetp = dtypename::_long; + if (t_lhs.has_modifier_front(dtypemod::_unsigned) or t_rhs.has_modifier_front(dtypemod::_unsigned)) + common_type.exttp.push_back(dtypemod::_unsigned); + else if (t_lhs.has_modifier_front(dtypemod::_signed) or t_rhs.has_modifier_front(dtypemod::_signed)) + common_type.exttp.push_back(dtypemod::_signed); + node.left = perform_implicit_conversions( + std::move(node.left), common_type, exprcat::rval); + node.right = perform_implicit_conversions( + std::move(node.right), common_type, exprcat::rval); + if (node.left->has_error() or node.right->has_error()) + { + node.cat = exprcat::error; + return; + } + } + else if (c_lhs == dtypeclass::_float) + { + if (t_lhs.basetp != dtypename::_double and t_rhs.basetp != dtypename::_double) + chirp_unreachable("Something's wrong, types don't match and neither value is a double"); + basic_type common_type(dtypename::_double); + node.left = perform_implicit_conversions( + std::move(node.left), common_type, exprcat::rval); + node.right = perform_implicit_conversions( + std::move(node.right), common_type, exprcat::rval); + if (node.left->has_error() or node.right->has_error()) + { + 
node.cat = exprcat::error; + return; + } + } + else if (c_lhs == dtypeclass::_ptr) + { + diagnostic(diagnostic_type::location_err) + .at(node.loc) + .reason("Arithmetic is not supported on pointers") + .report(diagnostics); + node.cat = exprcat::error; + return; + } } + node.cat = exprcat::rval; + node.type = node.left->type; } + return; + + _bad_types: + node.cat = exprcat::error; + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) + .at(node.op_loc) + .reason("Operand types don't match") + .report(diagnostics); } void analyser::visit_unop(unop& node) @@ -144,19 +256,43 @@ void analyser::visit_unop(unop& node) switch (node.op) { case tkn_type::plus_op: + { + auto& operand = *node.operand; node.cat = exprcat::rval; - node.type = node.operand->type; // Do promotion + auto promoted_expr = promote_value(std::move(node.operand)); + if (promoted_expr->has_error()) + node.cat = exprcat::error; + node.type = promoted_expr->type; + node.operand = std::move(promoted_expr); break; + } case tkn_type::minus_op: + { + auto& operand = *node.operand; node.cat = exprcat::rval; - node.type = node.operand->type; - // TODO: Assert type is an integer, do promotion + // Check that the type is an integer or a float, do promotion + auto promoted_expr = promote_value(std::move(node.operand)); + if (promoted_expr->has_error()) + node.cat = exprcat::error; + else if (promoted_expr->type.to_class() != dtypeclass::_int and + promoted_expr->type.to_class() != dtypeclass::_float) + { + diagnostic(diagnostic_type::location_err) + .at(node.loc) + .reason("Cannot negate a non-number") + .report(this->diagnostics); + node.cat = exprcat::error; + break; + } + node.type = promoted_expr->type; + node.operand = std::move(promoted_expr); break; + } case tkn_type::ref_op: if (node.operand->cat != exprcat::lval) { - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? 
diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Cannot take address of a non-lvalue") .report(diagnostics); @@ -165,12 +301,12 @@ void analyser::visit_unop(unop& node) } node.cat = exprcat::rval; node.type = node.operand->type; - node.type.exttp.push_back(static_cast(dtypemod::_ptr)); + node.type.exttp.push_back(dtypemod::_ptr); break; case tkn_type::deref_op: - if (node.operand->type.exttp.empty() or node.operand->type.exttp.back() != static_cast(dtypemod::_ptr)) + if (node.operand->type.exttp.empty() or node.operand->type.exttp.back() != dtypemod::_ptr) { - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Cannot dereference a non-pointer") .report(diagnostics); @@ -182,7 +318,7 @@ void analyser::visit_unop(unop& node) node.type.exttp.pop_back(); break; default: - std::abort(); + chirp_unreachable("visit_unop"); } } @@ -197,10 +333,9 @@ void analyser::visit_arguments(arguments& node) void analyser::visit_func_call(func_call& node) { visit_expr(*node.callee); - visit_arguments(node.args); node.type = node.callee->type; - if (!node.type.exttp.empty() and static_cast(node.type.exttp.back()) == dtypemod::_func) + if (!node.type.exttp.empty() and node.type.exttp.back() == dtypemod::_func) { node.type.exttp.pop_back(); node.cat = exprcat::rval; @@ -208,36 +343,39 @@ void analyser::visit_func_call(func_call& node) else { node.cat = exprcat::error; - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Cannot call non-function") .report(diagnostics); } + + visit_arguments(node.args); } void analyser::visit_id_ref_expr(id_ref_expr& node) { //! 
Point of interest - if (auto sym = sym_tracker.lookup_sym_qual(node.ident)) + if (auto* sym = sym_tracker.lookup_sym_qual(node.ident)) { - while (sym->kind == decl_kind::external) - sym = static_cast(sym)->inner_decl.get(); - node.target = sym; - switch (sym->kind) + auto const* decl = sym->target; + while (decl->kind == decl_kind::external) + decl = static_cast(decl)->inner_decl.get(); + node.target = decl; + switch (decl->kind) { case decl_kind::var: - node.type = static_cast(*sym).type; + node.type = static_cast(*decl).type; node.cat = exprcat::lval; break; case decl_kind::fdecl: case decl_kind::fdef: - node.type = static_cast(*sym).result_type; + node.type = static_cast(*decl).result_type; node.cat = exprcat::rval; - node.type.exttp.push_back(static_cast(dtypemod::_func)); + node.type.exttp.push_back(dtypemod::_func); break; default: node.cat = exprcat::error; - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Unknown declaration type") .report(diagnostics); @@ -255,8 +393,8 @@ void analyser::visit_string_literal(string_literal& node) return; node.cat = exprcat::rval; node.type.basetp = dtypename::_char; - node.type.exttp.push_back(static_cast(dtypemod::_const)); - node.type.exttp.push_back(static_cast(dtypemod::_ptr)); + node.type.exttp.push_back(dtypemod::_const); + node.type.exttp.push_back(dtypemod::_ptr); } void analyser::visit_integral_literal(integral_literal& node) @@ -278,12 +416,324 @@ void analyser::visit_cast_expr(cast_expr& node) visit_expr(*node.operand); } +void analyser::visit_alloca_expr(alloca_expr& node) +{ + basic_type size_type(dtypename::_long); + size_type.exttp.push_back(dtypemod::_unsigned); + auto promoted_size = perform_implicit_conversions( + std::move(node.size), size_type, exprcat::rval); + if (promoted_size->cat == exprcat::error) + { + node.cat = exprcat::error; + } + else + { + node.cat = exprcat::rval; + 
node.type.basetp = dtypename::_none; + node.type.exttp.push_back(dtypemod::_ptr); + } + node.size = std::move(promoted_size); + node.type = node.alloc_type; + node.type.exttp.push_back(dtypemod::_ptr); +} + +exprh analyser::convert_to_rvalue(exprh source) +{ + basic_type type = source->type; + if (type.has_modifier_back(dtypemod::_const)) + type.exttp.pop_back(); + return perform_implicit_conversions( + std::move(source), type, exprcat::rval); +} + +exprh analyser::promote_value(exprh source) +{ + source = convert_to_rvalue(std::move(source)); + if (source->has_error()) + return source; + + basic_type new_type = source->type; + if (new_type.to_class() == dtypeclass::_int) + { + if (new_type.basetp != dtypename::_long) + new_type.basetp = dtypename::_int; + } + else if (new_type.to_class() == dtypeclass::_float) + { + // Nothing to be done + } + + return perform_implicit_conversions( + std::move(source), new_type, exprcat::rval); +} + +// This function tries to perform implicit conversions to convert an expression to the desired type +// On failure to do so, returns null +exprh analyser::perform_implicit_conversions(exprh source, basic_type const& target_type, exprcat target_cat) +{ + dtypeclass cl_src = source->type.to_class(), cl_dst = target_type.to_class(); + if (cl_src == cl_dst) + { + // Check if types are equal + if (target_type != source->type) + { + // First, remove const if necessary... 
+ if (source->type.has_modifier_back(dtypemod::_const)) + { + auto const_cast_ = new_node(cast_kind::_const); + const_cast_->loc = source->loc; + const_cast_->type.basetp = source->type.basetp; + for (auto mod : source->type.exttp) + if (mod != dtypemod::_const) + const_cast_->type.exttp.push_back(mod); + const_cast_->cat = exprcat::rval; + const_cast_->operand = std::move(source); + source = std::move(const_cast_); + } + + // Integral conversions + // Convert between different operand sizes + // dtypeclass none shouldn't be passed in the first place + if (cl_dst == dtypeclass::_int) + { + // For now, just convert directly to the target type, then match signedness + + if (source->type.basetp != target_type.basetp) + { + auto grade_cast = new_node(cast_kind::_grade); + grade_cast->loc = source->loc; + grade_cast->type.basetp = target_type.basetp; + for (auto mod : source->type.exttp) + grade_cast->type.exttp.push_back(mod); + grade_cast->cat = exprcat::rval; + grade_cast->operand = std::move(source); + source = std::move(grade_cast); + } + + if (target_type.has_modifier_back(dtypemod::_signed) and !source->type.has_modifier_back(dtypemod::_signed)) + { + auto sign_cast = new_node(cast_kind::_sign); + sign_cast->loc = source->loc; + sign_cast->type.basetp = target_type.basetp; + sign_cast->type.exttp.push_back(dtypemod::_signed); + sign_cast->cat = exprcat::rval; + sign_cast->operand = std::move(source); + source = std::move(sign_cast); + } + else if (target_type.has_modifier_back(dtypemod::_unsigned) and !source->type.has_modifier_back(dtypemod::_unsigned)) + { + auto sign_cast = new_node(cast_kind::_sign); + sign_cast->loc = source->loc; + sign_cast->type.basetp = target_type.basetp; + sign_cast->type.exttp.push_back(dtypemod::_unsigned); + sign_cast->cat = exprcat::rval; + sign_cast->operand = std::move(source); + source = std::move(sign_cast); + } + + // We're done + } + else if (cl_dst == dtypeclass::_float) + { + auto grade_cast = new_node(cast_kind::_grade); + 
grade_cast->loc = source->loc; + grade_cast->type.basetp = target_type.basetp; + // No modifiers to add + grade_cast->cat = exprcat::rval; + grade_cast->operand = std::move(source); + source = std::move(grade_cast); + } + else if (cl_dst == dtypeclass::_bool) + { + // Shouldn't happen! + chirp_unreachable("Conversion from bool to bool not needed"); + } + else if (cl_dst == dtypeclass::_ptr) + { + // TODO + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) + .at(source->loc) + .reason("Cannot convert between pointer types (yet)") + .report(this->diagnostics); + goto _gen_error; + } + else if (cl_dst == dtypeclass::_func) + { + diagnostic(diagnostic_type::location_err) + .at(source->loc) + .reason("Cannot convert function types") + .report(this->diagnostics); + goto _gen_error; + } + + // ...and add it back if required + if (target_type.has_modifier_back(dtypemod::_const)) + { + auto const_cast_ = new_node(cast_kind::_const); + const_cast_->loc = source->loc; + const_cast_->type.basetp = target_type.basetp; + for (auto mod : source->type.exttp) + const_cast_->type.exttp.push_back(mod); + const_cast_->type.exttp.push_back(dtypemod::_const); + const_cast_->cat = exprcat::rval; + const_cast_->operand = std::move(source); + source = std::move(const_cast_); + } + } + // No more conversions needed + } + // int <-> float conversions + else if (cl_dst == dtypeclass::_float and cl_src == dtypeclass::_int + or cl_dst == dtypeclass::_int and cl_src == dtypeclass::_float) + { + { + // First, convert to a signed value (if int) and remove const + basic_type clean_type = source->type; + if (clean_type.has_modifier_back(dtypemod::_const)) + clean_type.exttp.pop_back(); + if (cl_src == dtypeclass::_int) + { + if (clean_type.has_modifier_back(dtypemod::_unsigned)) + clean_type.exttp.pop_back(); + if (!clean_type.has_modifier_back(dtypemod::_signed)) + clean_type.exttp.push_back(dtypemod::_signed); + } + auto clean_val = 
perform_implicit_conversions( + std::move(source), clean_type, exprcat::rval); + if (clean_val->cat == exprcat::error) + return clean_val; + source = std::move(clean_val); + } + // Next, perform the float conversion + { + auto float_cast = new_node(cast_kind::_float); + float_cast->loc = source->loc; + float_cast->type = target_type; + if (target_type.has_modifier_back(dtypemod::_const)) + float_cast->type.exttp.pop_back(); + if (float_cast->type.has_modifier_back(dtypemod::_signed) or + float_cast->type.has_modifier_back(dtypemod::_unsigned)) + float_cast->type.exttp.pop_back(); + float_cast->cat = exprcat::rval; + float_cast->operand = std::move(source); + source = std::move(float_cast); + } + // Now, convert to the target type + return perform_implicit_conversions( + std::move(source), target_type, target_cat); + } + // boolean conversions + else if (cl_dst == dtypeclass::_bool) + { + location_range src_loc = source->loc; + switch (cl_src) + { + case dtypeclass::_int: + case dtypeclass::_float: + { + // TODO: Add float literals + exprh arg; + { + auto lit = new_node(); + lit->value = 0; + lit->cat = exprcat::rval; + lit->type.basetp = dtypename::_int; + lit->type.exttp.push_back(dtypemod::_signed); + arg = std::move(lit); + } + + if (cl_src == dtypeclass::_float) + // Assume this doesn't fail + arg = perform_implicit_conversions( + std::move(arg), basic_type(dtypename::_float), exprcat::rval); + + auto comp = new_node(tkn_type::noteq_op, std::move(source), std::move(arg)); + comp->cat = exprcat::rval; + comp->type.basetp = dtypename::_bool; + source = std::move(comp); + break; + } + case dtypeclass::_ptr: + { + auto arg = new_node(); + arg->cat = exprcat::rval; + arg->type = source->type; + auto comp = new_node(tkn_type::noteq_op, std::move(source), std::move(arg)); + comp->cat = exprcat::rval; + comp->type.basetp = dtypename::_bool; + source = std::move(comp); + break; + } + case dtypeclass::_bool: + case dtypeclass::_none: + case dtypeclass::_func: + goto 
_no_conv; + } + // Construct cast node + auto cast_node = new_node(cast_kind::_bool); + cast_node->loc = src_loc; + cast_node->type = source->type; + cast_node->cat = source->cat; + cast_node->operand = std::move(source); + source = std::move(cast_node); + } + else + { + _no_conv: + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) + .at(source->loc) + .reason("Cannot convert expression to expected type") + .report(this->diagnostics); + goto _gen_error; + } + + // Perform value category conversions + if (target_cat == exprcat::lval and source->cat == exprcat::rval) + { + diagnostic(diagnostic_type::location_err) + .at(source->loc) + .reason("Cannot convert an rvalue to lvalue") + .report(this->diagnostics); + + // Provide a placeholder expression for debugging purposes + _gen_error: + auto err_expr = new_node(cast_kind::_invalid); + err_expr->loc = source->loc; + err_expr->type = target_type; + err_expr->cat = exprcat::error; + err_expr->operand = std::move(source); + return err_expr; + } + else if (target_cat == exprcat::rval and source->cat == exprcat::lval) + { + if (cl_src == dtypeclass::_func) + { + diagnostic(diagnostic_type::location_err) + .at(source->loc) + .reason("Cannot convert a function lvalue to rvalue") + .report(this->diagnostics); + goto _gen_error; + } + auto cat_cast = new_node(cast_kind::_cat); + cat_cast->loc = source->loc; + cat_cast->type = target_type; + cat_cast->cat = exprcat::rval; + cat_cast->operand = std::move(source); + return cat_cast; + } + return source; +} + // Declarations void analyser::visit_var_decl(var_decl& node) { //! 
Point of interest - if (!sym_tracker.bind_sym(node.ident, node)) + auto* sym = sym_tracker.decl_sym(node.ident, node); + // Inherit from enclosing scope (parameter scope is private) + sym->is_global = sym_tracker.get_scope()->is_global; + sym->has_storage = true; + if (!sym_tracker.bind_sym(sym)) { diagnostic(diagnostic_type::location_err) .at(node.loc) @@ -297,9 +747,9 @@ void analyser::visit_var_decl(var_decl& node) { // Checks if it's a pointer. bool is_ptr = false; - for (std::byte d : node.type.exttp) + for (auto d : node.type.exttp) { - if (static_cast(d) == dtypemod::_ptr) + if (d == dtypemod::_ptr) { is_ptr = true; break; @@ -314,13 +764,23 @@ void analyser::visit_var_decl(var_decl& node) .report(this->diagnostics); } } - // TODO: Convert from initializer type (if any) to variable type if (node.init) + { visit_expr(*node.init); + // Convert from initializer type (if any) to variable type + if (node.init->cat != exprcat::error) + node.init = perform_implicit_conversions( + std::move(node.init), node.type, exprcat::rval); + } } void analyser::visit_entry_decl(entry_decl& node) { + auto* sym = sym_tracker.decl_sym(); + sym->has_storage = true; + sym->is_entry = true; + sym->target = &node; + node.symbol = sym; visit_compound_stmt(static_cast(*node.code)); } @@ -331,26 +791,31 @@ void analyser::visit_import_decl(import_decl& node) void analyser::visit_extern_decl(extern_decl& node) { - // TODO: Bind name + visit_decl(*node.inner_decl); + if (auto* sym = node.inner_decl->symbol) + { + node.symbol = sym; + sym->full_name.parts.clear(); + sym->full_name.parts.push_back(identifier::from(std::string(node.real_name), node.name_loc)); + } + else + { + diagnostic(diagnostic_type::location_err) + .at(node.loc) + .reason("Extern declaration doesn't declare a symbol") + .report(diagnostics); + } } void analyser::visit_namespace_decl(namespace_decl& node) { // This is even more hacky beyond any belief (pt. 
1) - #if 0 - for (auto& d : node.fdecls) - { - d->ident.namespaces.insert(d->ident.namespaces.begin(), node.ident.name); - visit_func_decl(*d); - } - for (auto& d : node.fdefs) - { - d->ident.namespaces.insert(d->ident.namespaces.begin(), node.ident.name); - visit_func_def(*d); - } - #endif - sym_tracker.bind_sym(node.ident, node); - sym_tracker.push_scope(&node.ident, &node); + auto* sym = sym_tracker.decl_sym(node.ident, node); + sym->is_global = true; + sym->has_storage = false; + if (!sym_tracker.bind_sym(sym)) + return; + sym_tracker.push_scope(sym); for (auto& d : node.decls) { // Push namespace context @@ -371,7 +836,10 @@ void analyser::visit_func_decl(func_decl& node) { //! Point of interest - if (!sym_tracker.bind_sym(node.ident, node)) + auto* sym = sym_tracker.decl_sym(node.ident, node); + sym->is_global = true; + sym->has_storage = false; + if (!sym_tracker.bind_sym(sym)) { // TODO: Check if declarations match diagnostic(diagnostic_type::location_err) @@ -380,7 +848,9 @@ void analyser::visit_func_decl(func_decl& node) .report(diagnostics); } - sym_tracker.push_scope(); + auto* body_scope = sym_tracker.decl_sym(); + body_scope->is_global = false; + sym_tracker.push_scope(body_scope); visit_parameters(node.params); sym_tracker.pop_scope(); } @@ -389,13 +859,15 @@ void analyser::visit_func_def(func_def& node) { //! 
Point of interest - if (auto sym = sym_tracker.find_sym_cur(node.ident)) + tracker::symbol* sym; + if ((sym = sym_tracker.find_sym_cur(node.ident))) { // TODO: Check if declarations match if (sym->target->kind == decl_kind::fdecl) { // Rebind the symbol to point to definition sym->target = &node; + node.symbol = sym; } else { @@ -407,10 +879,15 @@ void analyser::visit_func_def(func_def& node) } else { - sym_tracker.push_sym_unsafe(node.ident, node); + sym = sym_tracker.decl_sym(node.ident, node); + sym->is_global = true; + sym->has_storage = true; + sym_tracker.push_sym_unsafe(sym); } - sym_tracker.push_scope(); + auto* body_scope = sym_tracker.decl_sym(); + body_scope->is_global = false; + sym_tracker.push_scope(body_scope); visit_parameters(node.params); visit_compound_stmt(*node.body); sym_tracker.pop_scope(); @@ -427,33 +904,39 @@ void analyser::visit_assign_stmt(assign_stmt& node) { //! Point of interest visit_expr(*node.target); + visit_expr(*node.value); + if (node.target->has_error()) + return; if (node.target->cat != exprcat::lval) { - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Assigning to a non-object value") .report(diagnostics); } // I'll make it better, I swear else if (!node.target->type.exttp.empty() - and node.target->type.exttp.back() == static_cast(dtypemod::_const)) + and node.target->type.exttp.back() == dtypemod::_const) { - diagnostic(diagnostic_type::location_err) + diagnostic(!soft_type_checks ? 
diagnostic_type::location_err : diagnostic_type::location_warning) .at(node.loc) .reason("Assigning to a constant value") .report(diagnostics); } - else + else if (node.value->cat != exprcat::error) { - // TODO: Convert from expression type to variable type + // Convert from expression type to variable type + node.value = perform_implicit_conversions( + std::move(node.value), node.target->type, exprcat::rval); } - - visit_expr(*node.value); } void analyser::visit_compound_stmt(compound_stmt& node) { - sym_tracker.push_scope(); + auto* sym = sym_tracker.decl_sym(); + sym->is_global = false; + sym->has_storage = false; + sym_tracker.push_scope(sym); for (auto& s : node.body) { visit_stmt(*s); @@ -470,6 +953,11 @@ void analyser::visit_ret_stmt(ret_stmt& node) void analyser::visit_conditional_stmt(conditional_stmt& node) { visit_expr(*node.cond); + if (node.cond->cat != exprcat::error) + { + node.cond = perform_implicit_conversions( + std::move(node.cond), basic_type(dtypename::_bool), exprcat::rval); + } visit_stmt(*node.true_branch); if (node.false_branch) visit_stmt(*node.false_branch); @@ -478,6 +966,11 @@ void analyser::visit_conditional_stmt(conditional_stmt& node) void analyser::visit_iteration_stmt(iteration_stmt& node) { visit_expr(*node.cond); + if (node.cond->cat != exprcat::error) + { + node.cond = perform_implicit_conversions( + std::move(node.cond), basic_type(dtypename::_bool), exprcat::rval); + } visit_stmt(*node.loop_body); } diff --git a/src/seman/analyser.hpp b/src/seman/analyser.hpp index 3e63b75..8d9609f 100644 --- a/src/seman/analyser.hpp +++ b/src/seman/analyser.hpp @@ -9,12 +9,16 @@ class analyser { public: analyser(ast_root& root, diagnostic_manager& diag) - : root(root), sym_tracker(diag), diagnostics(diag) + : root(root), sym_tracker(diag, &root), diagnostics(diag) + {} + + void analyse(); + tracker& get_tracker() { - sym_tracker.set_root(&root); + return sym_tracker; } - void analyse(); + bool soft_type_checks = false; protected: ast_root& 
root; @@ -32,11 +36,16 @@ class analyser void visit_arguments(arguments&); void visit_func_call(func_call&); void visit_id_ref_expr(id_ref_expr&); - void visit_loperand(loperand&) = delete; void visit_string_literal(string_literal&); void visit_integral_literal(integral_literal&); void visit_nullptr_literal(nullptr_literal&); void visit_cast_expr(cast_expr&); + void visit_alloca_expr(alloca_expr&); + // Used to obtain a value ready for operations, by converting to rval and removing const + exprh convert_to_rvalue(exprh source); + // Converts to rvalue, then integral types -> int or long + exprh promote_value(exprh source); + exprh perform_implicit_conversions(exprh source, basic_type const& target_type, exprcat target_cat); // Declarations void visit_var_decl(var_decl&); void visit_entry_decl(entry_decl&); diff --git a/src/seman/sym_dumper.cpp b/src/seman/sym_dumper.cpp new file mode 100644 index 0000000..74f912f --- /dev/null +++ b/src/seman/sym_dumper.cpp @@ -0,0 +1,146 @@ +#include "sym_dumper.hpp" + +constexpr color c_color_decl = color::green | color::bright | color::bold; +constexpr color c_color_identifier = color::blue | color::bright; +constexpr color c_color_address = color::red | color::green | color::bright; + +static std::string indent(int x) +{ + std::string result; + result.resize(x * 3, ' '); + return result; +} + +void text_symbol_dumper::dump_symbols() +{ + dump(*tracker.get_top()); + if (dump_syms_extra) + { + std::cout << "Anonymous syms\n"; + ++depth; + for (auto& sym : tracker.table) + if (!sym->is_global) + dump(*sym); + --depth; + } +} + +void text_symbol_dumper::dump(tracker::symbol& sym) +{ + indent(depth); + std::cout << "symbol"; + if (sym.has_name) + { + if (sym.is_global) + { + std::cout << " <"; + begin_color(c_color_identifier); + int i = 0, len = sym.full_name.parts.size(); + for (auto const& id : sym.full_name.parts) + { + std::cout << id.name; + if (++i != len) + { + std::cout << "."; + } + } + end_color(); + std::cout << '>'; + } 
+ std::cout << ' '; + begin_color(c_color_identifier); + std::cout << sym.name.name; + end_color(); + } + + // Flags + if (sym.has_name) + std::cout << " has_name"; + if (sym.is_global) + std::cout << " global"; + if (sym.has_storage) + std::cout << " has_storage"; + if (sym.is_entry) + std::cout << " entry"; + if (sym.is_scope) + std::cout << " scope"; + + if (sym.target) + { + std::cout << ' '; + begin_color(c_color_decl); + switch (sym.target->kind) + { + case decl_kind::root: + std::cout << "root"; + break; + case decl_kind::var: + std::cout << "var"; + break; + case decl_kind::entry: + std::cout << "entry"; + break; + case decl_kind::import: + std::cout << "import"; + break; + case decl_kind::nspace: + std::cout << "namespace"; + break; + case decl_kind::fdecl: + std::cout << "func_decl"; + break; + case decl_kind::fdef: + std::cout << "func_def"; + break; + case decl_kind::external: + std::cout << "extern"; + break; + } + end_color(); + std::cout << ' '; + begin_color(c_color_address); + std::cout << '[' << sym.target << ']'; + end_color(); + } + std::cout << '\n'; + // Iterate over children + if (sym.target) + { + auto const It = [this](decl& decl) { + if (decl.symbol) + return dump(*decl.symbol); + }; + decl* d = sym.target; + ++depth; + switch (d->kind) + { + case decl_kind::root: + { + for (auto& c : static_cast(d)->top_decls) + It(*c); + break; + } + case decl_kind::nspace: + { + for (auto& c : static_cast(d)->decls) + It(*c); + break; + } + case decl_kind::fdecl: + case decl_kind::fdef: + { + for (auto& c : static_cast(d)->params.body) + It(*c); + break; + } + case decl_kind::external: + It(*static_cast(d)->inner_decl); + break; + case decl_kind::var: + case decl_kind::entry: + case decl_kind::import: + break; + } + --depth; + } +} diff --git a/src/seman/sym_dumper.hpp b/src/seman/sym_dumper.hpp new file mode 100644 index 0000000..1bd7e1d --- /dev/null +++ b/src/seman/sym_dumper.hpp @@ -0,0 +1,24 @@ +/// \file Symbol table dumper for debug output + 
+#pragma once + +#include "tracker.hpp" +#include "../shared/text_dumper_base.hpp" + +enum class color; + +class text_symbol_dumper : public text_dumper_base { + bool dump_syms_extra; + int depth = 0; + tracker& tracker; + + public: + text_symbol_dumper(bool enable_colors, bool dump_syms_extra, ::tracker& tracker) + : text_dumper_base(enable_colors), dump_syms_extra(dump_syms_extra), tracker(tracker) + {} + + void dump_symbols(); + + private: + void dump(tracker::symbol& sym); +}; diff --git a/src/seman/tracker.cpp b/src/seman/tracker.cpp index 8dd8206..044e762 100644 --- a/src/seman/tracker.cpp +++ b/src/seman/tracker.cpp @@ -1,15 +1,33 @@ #include "tracker.hpp" -bool tracker::bind_sym(identifier const& name, decl const& target) +#include + +tracker::symbol* tracker::decl_sym() +{ + auto* sym = new symbol; + table.push_back(std::unique_ptr(sym)); + return sym; +} + +tracker::symbol* tracker::decl_sym(identifier const& name, decl& target) +{ + auto* sym = decl_sym(); + sym->has_name = !name.name.empty(); + sym->name = name; + sym->target = &target; + return sym; +} + +bool tracker::bind_sym(tracker::symbol* sym) { // Search for collisions, the current depth only - if (auto prev = find_sym_cur(name)) + if (auto* prev = find_sym_cur(sym->name)) { // Found a collision { diagnostic(diagnostic_type::location_err) - .at(target.loc) - .reason("Redefinition of a variable/function") + .at(sym->target->loc) + .reason("Redefinition of a symbol") .report(diagnostics); } { @@ -20,45 +38,51 @@ bool tracker::bind_sym(identifier const& name, decl const& target) } return false; } - push_sym_unsafe(name, target); + push_sym_unsafe(sym); return true; } -void tracker::push_sym_unsafe(identifier const& name, decl const& target) +void tracker::push_sym_unsafe(tracker::symbol* sym) { - tracked_sym v; - v.name = name; - v.depth = depth; - v.target = &target; - syms.push_back(v); + if (sym->target) + sym->target->symbol = sym; + if (sym->has_name) + { + if (sym->is_global) + sym->full_name = 
consolidate_path(sym->name); + else + sym->full_name.parts.push_back(sym->name); + } + syms.push_back(sym); + // FIXME: This only considers the immediately enclosing scope if (scopes.back().begin == syms.end()) --scopes.back().begin; } -tracker::tracked_sym* tracker::find_sym_cur(identifier const& name) +tracker::symbol* tracker::find_sym_cur(identifier const& name) { - for (auto it = syms.rbegin(), end = syms.rend(); it != end; ++it) + for (auto it = syms.rbegin(), end = std::reverse_iterator(scopes.back().begin); it != end; ++it) { - if (it->depth != depth) - break; - - if (it->name.name == name.name) - return &*it; + if ((**it).name.name == name.name) + return & **it; } return nullptr; } // Linear Search, inefficient. -decl const* tracker::lookup_sym(identifier const& name) const +tracker::symbol* tracker::lookup_sym(identifier const& name) { auto lastend = syms.cend(); for (auto scopeit = scopes.rbegin(), scopeend = scopes.rend(); scopeit != scopeend; ++scopeit) { + if (scopeit->begin == syms.cend()) + // Skip empty scope + continue; for (symlist::const_iterator it = scopeit->begin; it != lastend; ++it) { - if (it->name.name == name.name) + if ((**it).name.name == name.name) { - return it->target; + return *it; } } lastend = scopeit->begin; @@ -66,15 +90,15 @@ decl const* tracker::lookup_sym(identifier const& name) const return nullptr; } -decl const* tracker::lookup_sym_qual(qual_identifier const& name) const +tracker::symbol* tracker::lookup_sym_qual(qual_identifier const& name) { - decl const* top_decl; + symbol* scope; identifier const* id = &name.parts.at(0); if (name.is_global) - top_decl = top_scope; + scope = get_top(); else - top_decl = lookup_sym(*id); - if (!top_decl) + scope = lookup_sym(*id); + if (!scope) { diagnostic(diagnostic_type::location_err) .at(id->loc) @@ -85,11 +109,11 @@ decl const* tracker::lookup_sym_qual(qual_identifier const& name) const for (size_t ii = 1, size = name.parts.size(); ii < size; ++ii) { id = &name.parts[ii]; - 
top_decl = lookup_decl_sym(*top_decl, *id); - if (!top_decl) + scope = lookup_decl_sym(*scope->target, *id); + if (!scope) return nullptr; } - return top_decl; + return scope; } static identifier const* get_declaration_name(decl const& dec) @@ -110,7 +134,7 @@ static identifier const* get_declaration_name(decl const& dec) } } -decl const* tracker::lookup_decl_sym(decl const& decl_scope, identifier const& name) const +tracker::symbol* tracker::lookup_decl_sym(decl const& decl_scope, identifier const& name) { declh const* begin; declh const* end; @@ -134,7 +158,7 @@ decl const* tracker::lookup_decl_sym(decl const& decl_scope, identifier const& n while (begin != end) { if (auto n = get_declaration_name(**begin); name.name == n->name) - return (*begin).get(); + return (**begin).symbol; ++begin; } diagnostic(diagnostic_type::location_err) @@ -144,6 +168,15 @@ decl const* tracker::lookup_decl_sym(decl const& decl_scope, identifier const& n return nullptr; } +void tracker::push_scope(symbol* sym) +{ + ++depth; + auto& s = scopes.emplace_back(); + s.begin = syms.end(); + s.sym = sym; + sym->is_scope = true; +} + // This function is basically efficient void tracker::pop_scope() { @@ -152,11 +185,25 @@ void tracker::pop_scope() --depth; } -void tracker::push_scope(identifier const* name, decl const* target) +void tracker::gen_top_symbol() { - ++depth; - auto& s = scopes.emplace_back(); - s.begin = syms.end(); - s.scope_name = name; - s.scope_target = target; + auto* top = decl_sym(); + top->target = top_scope; + top->is_global = true; + top->has_storage = false; + push_scope(top); +} + +raw_qual_identifier tracker::consolidate_path(identifier const& name) +{ + raw_qual_identifier id; + for (auto const& scope : scopes) + { + if (scope.sym and scope.sym->has_name) + { + id.parts.push_back(scope.sym->name); + } + } + id.parts.push_back(name); + return id; } diff --git a/src/seman/tracker.hpp b/src/seman/tracker.hpp index a882d07..85a3ca2 100644 --- a/src/seman/tracker.hpp +++ 
b/src/seman/tracker.hpp @@ -1,11 +1,34 @@ #pragma once +#include #include #include +#include #include "../ast/ast.hpp" #include "../shared/diagnostic.hpp" +// Tracked symbol, can go out of scope +class tracker_symbol +{ + public: + //tracker_symbol* parent = nullptr; + identifier name; + raw_qual_identifier full_name; + decl* target = nullptr; + + // Flags + bool has_name : 1; + bool is_global : 1; // Is accessible in global scope? Is exportable? + bool has_storage : 1; // Does this symbol declare storage? Is a physical entity in the program? + bool is_entry : 1; + bool is_scope : 1; // Does this symbol define a scope? + + tracker_symbol() + : has_name(false), is_global(true), has_storage(false), is_entry(false), is_scope(false) + {} +}; + /* Semantics tracker, is used to get the states of variables during codegen. The tracker is also used accross files, @@ -15,58 +38,66 @@ which means it also tracks imports. class tracker { public: + using symbol = tracker_symbol; - // Tracked symbol, can go out of scope - class tracked_sym - { - public: - identifier name; - decl const* target; - unsigned int depth; - }; - - using symlist = std::list; + // List of currently availble symbols in scope + using symlist = std::list; using symiter = symlist::iterator; - struct tracked_scope + // Table of all symbols defined in the program + using symtable = std::vector>; + + struct scope { + symbol* sym; symiter begin; - identifier const* scope_name; - decl const* scope_target; }; - tracker(diagnostic_manager& diag) - : diagnostics(diag) {} - - void set_root(ast_root const* root) { - top_scope = root; + tracker(diagnostic_manager& diag, ast_root* root) + : top_scope(root), diagnostics(diag) + { + gen_top_symbol(); } + // Delare a symbol in current scope (without binding it) + symbol* decl_sym(); + symbol* decl_sym(identifier const& name, decl& target); + // Returns true if a symbol with same name in current scope DOESN'T exist // Return false otherwise - bool bind_sym(identifier const& 
name, decl const& target); + bool bind_sym(symbol* sym); // Binds a new symbol regardless of whether it's present in the current scope - void push_sym_unsafe(identifier const& name, decl const& target); + void push_sym_unsafe(symbol* sym); // Searches a symbol with the same name in current scope only // Returns nullptr if symbol is not found - tracked_sym* find_sym_cur(identifier const& name); + symbol* find_sym_cur(identifier const& name); // If symbol doesn't exist, returns nullptr - decl const* lookup_sym(identifier const& name) const; - decl const* lookup_sym_qual(qual_identifier const& name) const; + symbol* lookup_sym(identifier const& name); + symbol* lookup_sym_qual(qual_identifier const& name); - decl const* lookup_decl_sym(decl const& decl_scope, identifier const& name) const; + symbol* lookup_decl_sym(decl const& decl_scope, identifier const& name); // Gets deeper in the scope - void push_scope(identifier const* name = nullptr, decl const* owner = nullptr); + void push_scope(symbol* sym); // Gets out of the current scope - // and check for variabless that are deeper then + // and check for variables that are deeper then // current scope, and deletes them. 
void pop_scope(); + symbol* get_top() + { + return table[0].get(); + } + + symbol* get_scope() + { + return scopes.back().sym; + } + private: // Depth in the scope unsigned int depth = 0; @@ -75,8 +106,15 @@ class tracker // Keeps track of all the variables symlist syms; - std::list scopes; - ast_root const* top_scope = nullptr; + std::list scopes; + ast_root* top_scope = nullptr; + + symtable table; + + void gen_top_symbol(); + raw_qual_identifier consolidate_path(identifier const& name); diagnostic_manager& diagnostics; + + friend class text_symbol_dumper; }; diff --git a/src/shared/color.cpp b/src/shared/color.cpp new file mode 100644 index 0000000..0bdff55 --- /dev/null +++ b/src/shared/color.cpp @@ -0,0 +1,25 @@ +#include "color.hpp" + +void begin_color(std::ostream& os, color clr) +{ + #ifdef __unix__ + // Doesn't care if it's on a VT100 terminal or not + // will do coloring anyway. + os << "\033["; + unsigned int col = static_cast(clr); + if ((clr & color::bright) != color::blank) + os << (90 + (col & 7)); + else + os << (30 + (col & 7)); + if ((clr & color::bold) != color::blank) + os << ";1"; + os << 'm'; + #endif +} + +void end_color(std::ostream& os) +{ + #ifdef __unix__ + os << "\033[m"; + #endif +} diff --git a/src/shared/color.hpp b/src/shared/color.hpp new file mode 100644 index 0000000..802177b --- /dev/null +++ b/src/shared/color.hpp @@ -0,0 +1,53 @@ +/* +Command-Line coloring, currently works only on linux VT100-based terminals.. +This should be overhaul, rn this is very hacky. 
+*/ +#pragma once + +#include + +enum class color +{ + blank = 0, + red = 1, + green = 2, + blue = 4, + bright = 8, + bold = 0x10, +}; + +constexpr color operator|(color a, color b) +{ + return static_cast(static_cast(a) | static_cast(b)); +} + +constexpr color operator&(color a, color b) +{ + return static_cast(static_cast(a) & static_cast(b)); +} + +void begin_color(std::ostream& os, color clr); +void end_color(std::ostream& os); + +struct color_scope +{ + bool use_color; + std::ostream& os; + color_scope(bool use_color, std::ostream& os, color clr) + : use_color(use_color), os(os) + { + if (use_color) + begin_color(os, clr); + } + ~color_scope() + { + if (use_color) + end_color(os); + } +}; + +inline void print_color(std::string const& txt, bool use_color, std::ostream& os, color clr) +{ + color_scope cs(use_color, os, clr); + os << txt; +} diff --git a/src/shared/diagnostic.cpp b/src/shared/diagnostic.cpp index d4aeef1..888b699 100644 --- a/src/shared/diagnostic.cpp +++ b/src/shared/diagnostic.cpp @@ -1,5 +1,6 @@ #include "diagnostic.hpp" -#include "../color.hpp" +#include "color.hpp" +#include "system.hpp" #include "location_provider.hpp" #include #include @@ -44,36 +45,15 @@ void diagnostic_manager::show(diagnostic const& d) if (d.type.is_warning()) { - if (has_color) - { - os << apply_color("[WARNING] ", color::red | color::green | color::bright | color::bold); - } - else - { - os << "[WARNING] "; - } + print_color("[WARNING] ", has_color, os, color::red | color::green | color::bright | color::bold); } else if (d.type.is_error()) { - if (has_color) - { - os << apply_color("[ERROR] ", color::red | color::bright | color::bold); - } - else - { - os << "[ERROR] "; - } + print_color("[ERROR] ", has_color, os, color::red | color::bright | color::bold); } else if (d.type.is_note()) { - if (has_color) - { - os << apply_color("[NOTE] ", color::green | color::blue | color::bright | color::bold); - } - else - { - os << "[NOTE] "; - } + print_color("[NOTE] ", has_color, 
os, color::green | color::blue | color::bright | color::bold); } os << d.msg; @@ -103,14 +83,7 @@ void diagnostic_manager::show(diagnostic const& d) os << "\n | \n"; } - if (has_color) - { - os << apply_color(get_spacing(tloc.line + 1, '>'), color::red | color::green | color::bright | color::bold); - } - else - { - os << get_spacing(tloc.line + 1, '>'); - } + print_color(get_spacing(tloc.line + 1, '>'), has_color, os, color::red | color::green | color::bright | color::bold); os << replace_tabs(current_source->at(tloc.line)); os << '\n'; @@ -133,14 +106,7 @@ void diagnostic_manager::show(diagnostic const& d) else indentation += '^'; } - if (has_color) - { - os << apply_color(std::move(indentation), color::green | color::bright); - } - else - { - os << indentation; - } + print_color(indentation, has_color, os, color::green | color::bright); os << '\n'; } if (tloc.line + 1 < current_source->size() && is_important(current_source->at(tloc.line + 1))) @@ -157,3 +123,11 @@ void diagnostic_manager::show(diagnostic const& d) } } } + +#ifdef __CHIRP_UNREACHABLE_AVAILABLE +[[noreturn]]void __chirp_unreachable(char const* message) +{ + std::cerr << "\nUnreachable code has been reached: " << message << '\n'; + std::abort(); +} +#endif diff --git a/src/shared/diagnostic.hpp b/src/shared/diagnostic.hpp index 3ca89c4..a199382 100644 --- a/src/shared/diagnostic.hpp +++ b/src/shared/diagnostic.hpp @@ -5,7 +5,6 @@ #include "location.hpp" #include "location_provider.hpp" -#include "../cmd.hpp" #include #include @@ -110,6 +109,7 @@ class diagnostic_manager std::vector const* current_source = nullptr; }; -inline void diagnostic::report(diagnostic_manager& mng) && { +inline void diagnostic::report(diagnostic_manager& mng) && +{ mng.show(*this); } diff --git a/src/shared/location.hpp b/src/shared/location.hpp index 0574e45..77e1693 100644 --- a/src/shared/location.hpp +++ b/src/shared/location.hpp @@ -22,6 +22,12 @@ class location int len = 0; // Length of the token }; +// State of 
location printer on a specific task, like AST dump, token dump, etc. +struct location_run +{ + location const* last_loc = nullptr; +}; + // Stolen from clang struct token_location { @@ -48,3 +54,5 @@ struct location_range constexpr location_range(token_location b, token_location e) noexcept : begin(b), end(e) {} }; + +void print_loc_single(location const& loc, std::string& str, location_run* run); diff --git a/src/shared/location_provider.cpp b/src/shared/location_provider.cpp index 44d9275..d64362f 100644 --- a/src/shared/location_provider.cpp +++ b/src/shared/location_provider.cpp @@ -1,5 +1,65 @@ #include "location_provider.hpp" -#include "../color.hpp" +#include "system.hpp" +#include + +void location_provider::begin_run(location_run& run) +{ + if (_current_run) + chirp_unreachable("Location run already in progress"); + _current_run = &run; + _current_run->last_loc = nullptr; +} + +void print_loc_single(location const& loc, std::string& str, location_run* run) +{ + // There are 3 cases: + // [has_miss or fname mismatch or line invalid] + // :|invalid:|invalid + // [line mismatch or col invalid] + // :line::|invalid + // [otherwise] + // :col: + bool has_miss = !run or !run->last_loc; + + if (has_miss or loc.filename != run->last_loc->filename or loc.line == -1) + { + has_miss = true; + str += loc.filename; + str += ":"; + } + + if (loc.line == -1) + { + has_miss = true; + str += "invalid:"; + } + else if (has_miss or loc.line != run->last_loc->line or loc.start == -1) + { + if (!has_miss) + { + str += ":line:"; + has_miss = true; + } + str += std::to_string(loc.line+1); + str += ":"; + } + + if (loc.start == -1) + { + str += "invalid"; + } + else + { + if (!has_miss) + { + str += ":col:"; + } + str += std::to_string(loc.start+1); + } + + if (run) + run->last_loc = &loc; +} std::string location_provider::print_loc(location_range loc) const { @@ -7,31 +67,11 @@ std::string location_provider::print_loc(location_range loc) const auto const& loce = get_loc(loc.end); 
std::string result; result += "<"; - result += locb.filename; - result += ":"; - if (locb.line == -1) - result += "invalid"; - else - result += std::to_string(locb.line+1); - result += ":"; - if (locb.start == -1) - result += "invalid"; - else - result += std::to_string(locb.start+1); + print_loc_single(locb, result, _current_run); if (&locb != &loce) { result += ", "; - result += loce.filename; - result += ":"; - if (loce.line == -1) - result += "invalid"; - else - result += std::to_string(loce.line+1); - result += ":"; - if (loce.start == -1) - result += "invalid"; - else - result += std::to_string(loce.start+1); + print_loc_single(loce, result, _current_run); } result += ">"; return result; diff --git a/src/shared/location_provider.hpp b/src/shared/location_provider.hpp index f3f4071..7646ac7 100644 --- a/src/shared/location_provider.hpp +++ b/src/shared/location_provider.hpp @@ -1,11 +1,19 @@ #pragma once #include "../lexer/token.hpp" +#include "location.hpp" class location_provider { + location_run* _current_run = nullptr; + public: virtual location const& get_loc(token_location) const = 0; std::string print_loc(location_range) const; + void begin_run(location_run& run); + void end_run() + { + _current_run = nullptr; + } }; diff --git a/src/shared/system.hpp b/src/shared/system.hpp new file mode 100644 index 0000000..5f03484 --- /dev/null +++ b/src/shared/system.hpp @@ -0,0 +1,48 @@ +#pragma once + +#define CHIRP_PLATFORMID_UNKNOWN 0 +#define CHIRP_PLATFORMID_UNIX 1 +#define CHIRP_PLATFORMID_WINNT 2 +#define CHIRP_PLATFORMID_LINUX 3 +#define CHIRP_PLATFORMID_APPLE 4 +#define CHIRP_PLATFORMID_BSD 5 + +#if defined(__unix__) || defined(__APPLE_CC__) +#define CHIRP_PLATFORM_UNIX +#define CHIRP_PLATFORM CHIRP_PLATFORMID_UNIX + +#if defined(__linux) || defined(linux) || defined(__linux__) +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_LINUX +#elif defined(__APPLE__) || defined(macintosh) || defined(__MACH__) +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_APPLE +#elif 
defined(__DragonFly__) || defined(__FreeBSD) +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_BSD +#else +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_UNIX +#endif + +#elif defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__) +#define CHIRP_PLATFORM_WINNT +#define CHIRP_PLATFORM CHIRP_PLATFORMID_WINNT +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_WINNT +#else +#define CHIRP_PLATFORM_UNKNOWN +#define CHIRP_PLATFORM CHIRP_PLATFORMID_UNKNOWN +#define CHIRP_SUBPLATFORM CHIRP_PLATFORMID_UNKNOWN +#endif + +#ifdef NDEBUG +#define chirp_unreachable(msg) __builtin_unreachable() +#else +#define ___STR(n) #n +#define __STR(n) ___STR(n) +#define chirp_unreachable(msg) __chirp_unreachable(__FILE__ ":" __STR(__LINE__) ": " msg) +#define __CHIRP_UNREACHABLE_AVAILABLE +[[noreturn]]extern void __chirp_unreachable(char const* message); +#endif + +#define chirp_assert(cond, msg) do \ + { \ + if (!static_cast(cond)) \ + chirp_unreachable(msg); \ + } while(false) diff --git a/src/shared/text_dumper_base.hpp b/src/shared/text_dumper_base.hpp new file mode 100644 index 0000000..8eb971d --- /dev/null +++ b/src/shared/text_dumper_base.hpp @@ -0,0 +1,33 @@ +#pragma once + +#include +#include "color.hpp" + +enum class color; + +class text_dumper_base +{ + public: + bool has_colors; + + text_dumper_base(bool use_color) + : has_colors(use_color) + {} + + protected: + void write_color(std::string const& txt, color clr) + { + ::print_color(txt, has_colors, std::cout, clr); + } + void begin_color(color clr) + { + if (has_colors) + ::begin_color(std::cout, clr); + } + void end_color() + { + if (has_colors) + ::end_color(std::cout); + } + void indent(int depth); +}; diff --git a/tests/Data Types/conv.chp b/tests/Data Types/conv.chp new file mode 100644 index 0000000..8e67bc1 --- /dev/null +++ b/tests/Data Types/conv.chp @@ -0,0 +1,19 @@ +# This file is intended to test various conversion cases +entry +{ + # Integral conversions + int: a = +5; + int: b = +'X'; + signed int: neg = -15; + 
float: fl = 0; + int: zero = fl; + # Boolean conversions + bool: True = true; + if True {} + if 0 {} + if 123 {} + if fl {} + if null {} + ptr const ptr int: Null = null as ptr const ptr int; + if Null {} +} diff --git a/tests/Namespace/namespace.chp b/tests/Namespace/namespace.chp index 40acf88..7094c80 100644 --- a/tests/Namespace/namespace.chp +++ b/tests/Namespace/namespace.chp @@ -11,6 +11,23 @@ namespace math } } +# Test nested namespaces +namespace a +{ + int: xxx = 5; + namespace b + { + namespace c + { + func none foo() + { + # Reference outer declaration + xxx = 6; + } + } + } +} + entry { math.add(1,1);