Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transitional changes to define a common type for nullable values (aka undefined symbols/labels) #42

Closed
wants to merge 1 commit into from

Conversation

antis81
Copy link
Contributor

@antis81 antis81 commented Jul 9, 2023

🎱 Closed -> collecting the good parts into a new (reviewable) PR #44.


For reference

TBH I don't know how this will turn out, but I think we can replace/enhance our current using Number = double; type…

One goal is to simplify bass's state logic and enable it to support more complex features (such as declarative stuff where code blocks can be placed independent of others of the same kind. One concrete example for this is the infamous "move section" use case).

The (obvious) advantage of having a "nullable" type (at runtime) is that we gain an additional state for values during parsing/serialization (std::any a; if (typeid(a) == typeid(void)) { /* yay, we have found a 'None' value */ }).

As a rule we can safely say that a label can be accessed before its definition:

negative_8bit_value = -123  ; NOTE: assignments always hold value! (not an address label)

    lda #-123
    bmi future_label
    lda #42

future_label   ; address label defined (address == PC) at this point
    ; more instructions
    rts

As discussed in issue #19, #37, #39 symbols are of constant nature and cannot be (re)defined once they hold a value. (NOTE: As the only exception to this, ACME supports the meta keyword !set negative_8bit_value = -124 to explicitly set a defined value - rarely used anyway…)

💡
Additionally I suggest adding explicit tests on our parsing rules (expecially "MetaBlock" etc.) to improve the debugging situation and find "unexpected" states of the AST more quickly. At least part of this can be done in this PR (still a hobby project…).

@antis81 antis81 marked this pull request as ready for review July 9, 2023 21:49
@antis81
Copy link
Contributor Author

antis81 commented Jul 10, 2023

Can you review with special focus on AnyValueContainer (maybe …Provider is a better name) please? While tests are all good, this enables Instructions to hold a "None" value with a known typeid… 🙂
(might be interesting that this value type can now also hold an always serializable AsmValue as long as hasValue() == true and AsmValue::valueless_by_exception() == false).

@sasq64
Copy link
Owner

sasq64 commented Jul 10, 2023

So I browsed through the code just now (on vacation so not always around my computer) -- and
the goal of handling/detecting more errors is good but I think the implementation needs another try.

Inheriting from AnyValueContainer is a bit strange, you would normally just keep it as a member.

Also in this case it seems you wouldn't even need std::any, you could just store the value as an optional<T> ?

In fact, couldn't you just replace uint32_t value with std::optional<uint32_t> value ?

@antis81
Copy link
Contributor Author

antis81 commented Jul 10, 2023

Inheriting from AnyValueContainer is a bit strange, you would normally just keep it as a member.

This is actually the implementation ("val" is a protected member of the base class - moved originally from Instruction class). Maybe the name is a little confusing… just ValueContainer or ValueProvider is better and also get/set was thought as short form of getValue/setValue). Probably I should rename those making this more prominent. Read it like "Instruction is a ValueContainer of type int32_t (and not the value itself)".

Also in this case it seems you wouldn't even need std::any, you could just store the value as an optional ?

Funny thing: All of those types can represent optional value(s). What is strange is that STL representation is not consistent. None values for those types are:

- std::any -> void
- std::variant -> nullptr (when using std::get_if<int32_t>())
- std::optional -> nullopt

(The std::optional type has a slight advantage vs std::any, that the type cannot change by assignment.)
In conclusion it actually doesn't make a big difference (even when taking space in memory into account).

Thinking this a little bit further we can use same technique on "complex" semantic values containing multiple val[1..n] members with different types (even nested) -> thinking of a ValuesContainer for those…
Example use cases are Section and Assert holding another MetaFn. Maybe I can implement a "stub" for this. Maybe some "stub" implementation outlining the concept will do for the moment…

Finally our question concerning the Number type can be answered -> std::variant<unsigned, int, double> should be good, but requires quite some refactoring work… 🚧 🙂 (I don't like changing too many things in one go here, but let's see where it goes…)

@sasq64
Copy link
Owner

sasq64 commented Jul 11, 2023

But an Instruction is not a value, it is an aggregate of an opcode, addressing mode and a value, so it should not inherit from the value representation.

So std::optional<T> as value is the way to go if supporting none values in instructions has enough benefit I think.

Changing the number type to a variant is not really feasible, type conversions would be a nightmare. Just using double covers all the cases we need since it can represent integers up to 2^48, and the math is at least consistent with how languages like JavaScript works.

(And even if Javascripts type conversions are bad, C/C++ is hardly better with its weird integer promotion rules).

I am still not sure exactly what the benefits are here. If you could add some tests that shows the errors it catches?

Then I could check that the simpler std::optional would also solve the same problem.

@antis81
Copy link
Contributor Author

antis81 commented Jul 12, 2023

My interpretation of std::variant was wrong, sorry! Let me correct that… Actually std::variant is a non-optional(!) type:

std::variant<double> vd;
std::cout << std::get<0>(vd) << '\n';   // outputs "0.0"

Generically spoken:
A variant is always initialized (either by value assignment or std::get<0>()).

Let's invalidate this… 🙂

std::variant<std::optional<double>> nullable_v1;
// or…
std::optional<std::variant<double>> nullable_v2;

@antis81
Copy link
Contributor Author

antis81 commented Jul 12, 2023

I am still not sure exactly what the benefits are here. If you could add some tests that shows the errors it catches?

Yes. Only currently I am still a little physically slowed down… 🚑

I like to generalize (streamline) our implementation logic - e.g. by using a "dictionary-like" (nullable) type as a "transitional" implementation. Just added MultiValueProvider yesterday, which hopefully gives a more complete picture:

// NOTE: This is a WIP. 
//            In current implementation T has to be of type std::variant to satisfy the interface.
//            (defaults to `MultiValueProvider<AsmValue>` aka `std::variant<int, double, std::string>`)
//            This is subject to change. (actually can be an list/set of `ValueProviders`)
class Section : public MultiValueProvider {
public:
    const std::string_view identifier;
    const std::string_view name;

    Section(std::string_view section_name)
        : identifier(persist(fmt::format("section.{}", section_name)),
        , name(persist(section_name))
    {
        reserveValue(fmt::format("{}.start", identifier));  // will initialize with std::optional<AsmValue> == nullopt
    }
    
    // members can be used for "non-serializable" values (like the above `identifier` and `name`).
    // ValueView start(fmt::format("{}.start"));  // convenience (not implemented -> ValueView is meant as a std::string_view equivalent). Effecticely results in a "binding"… :)
};

(This method also is applicable to MetaFn, Macro, etc. where we refererence labels/symbols by name…)
While it actually (heavily) tweaks C++ language semantics, using std::map<std::optional> has quite some benefits (a bit similar to Lua, Python, …).

However speaking about it… Let me make this more "independent" and move the (unfinished) concept out of the way until we can use std::optional<something> safely (-> (wip)valueprovider.h).

@antis81 antis81 changed the title Transitional changes to use std::any as an optional value type for symbols Transitional changes to use nullable value type for values (and symbols) Jul 12, 2023
@antis81 antis81 changed the title Transitional changes to use nullable value type for values (and symbols) Transitional changes to define a common type for nullable values (aka undefined symbols/labels) Jul 12, 2023
@antis81
Copy link
Contributor Author

antis81 commented Jul 13, 2023

The value provider implementation is slowly taking shape… If you like to check it out please modify this line for feature toggle. Concerning the tests running make test gives a pretty good idea.)

Currently the commmon data type is std::any (because std::optional<T> is a class template…). Also Symbol and SemanticValue already are based on std::any so it really helps with the transition.

The most prominent corner case I see is, that it is possible to accidentally re-assign the "typed" value:

ValueProvider<int> vp;
vp.setValue(42);  // just like it was a std::optional<int>
vp.getAny() = "i will produce a bad_any_cast_error";   // changes type…
std::cout << vp.getValue() << '\n';   // whooops…

However since it is type-safe I think this is acceptable and easy to find.

@sasq64
Copy link
Owner

sasq64 commented Jul 13, 2023

Concerning the tests running make test gives a pretty good idea.)

But you have not added any tests ?

@antis81
Copy link
Contributor Author

antis81 commented Jul 14, 2023

Sorry should be there now… TBH those tests are pretty trivial and MultiValueProvider is currently only "dead code" (doesn't relate to ValueProviders at all…). I still think it has potential, but the concept (unfortunately) kind of contradicts the SymbolTable (starts to feel more like a SymbolTable 2.0 🙂).

What I really like is that it provides more information about internal variable reassignments. Really can help with debugging the status quo…

💡 FYI:
Probably you know this already, still helpful in this context:
Those issues we face with "movable" sections etc. boil down to a single code line (grep -r 'return zero' 🙂). Changing this to empty opens the "pandora box"…

I also experimented a little with std::optional<T> (think we have tried this somewhere in the past, still interesting…):

template <typename T>
void SymbolTable::set_opt(std::string const& name, std::optional<T> const& opt_val) {
  if (opt_val.has_value()) {
     set(name, opt_val.value());
  } else {
    undefined.insert(name);
  }
}

Got it to run through all the passes, but it messes up PC really badly…

Are you okay with using those "transitional" functions? Because this PR is actually mergable (holds quite some goodies…). Or shall I "split" this and revert those USE_BASS_VALUEPROVIDER blocks so they don't affect any code at all?

@sasq64
Copy link
Owner

sasq64 commented Jul 16, 2023

It doesn't seem like you have read my comments... this PR adds a lot of complexity with no proven upside and I said already in the first comment that it needed do be done differently.

That's why I wanted to see some test that shows what it helps with, so I could find an alternative way of supporting it without the Provider classes.

@sasq64
Copy link
Owner

sasq64 commented Jul 16, 2023

So if it is shown that handling "unset" values like this is useful, and it actually needs a custom class, I would not expect you to inherit from it, but rather used it in place of the actual value.

@antis81
Copy link
Contributor Author

antis81 commented Jul 16, 2023

It doesn't seem like you have read my comments...

Actually I do. Sorry if it doesn't always show…
(I will "split" this PR so it becomes more "reviewable"…)

--- FYI ---

Apart from that here the underlying problem:
C++ pretty much "fails" on data serialization as there is some essential information missing in the STL specifically on class template types in the RTTI (such as std::optional vs std::optional<T>). While it is perfectly possible to compare any types by typeid(std::any) == typeid(my_any_value) (even without RTTI using std::is_same_v). The same is not available for class templates. (Means: We cannot hold a list of std::optional values… 💥). To "empower" the type and make it more similar to what std::any provides for us we would have to create our own RTTI lookup system:

std::set(typeinfo) optional_types;
bool is_optional(typeinfo a_type) { return optionals.contains(typeid(a_type)); }

template <typename T>
class Optional : public std::optional<T> {
public:
  Optional<T>() : std::optional<T>() {
    optionals.insert(typeid(T));
  }
  // … minimal implementation … 
};

Since std::any is type-safe and not a class template it actually "superseeds" std::optional in almost any aspect.

@antis81 antis81 force-pushed the dev branch 2 times, most recently from dc43a9d to 2c24463 Compare July 16, 2023 16:41
@antis81
Copy link
Contributor Author

antis81 commented Jul 16, 2023

Moved value provider into feature branch. This PR is ready for review/merge!

antis81 added a commit to antis81/bass that referenced this pull request Jul 16, 2023
NOTE: Does not change current parsing logic!

- Throw on invalid conversion of an empty std::any value to non-optional (c++) type (int,Number,string,…)
- Remove (unused) SemanticValues array value cast function
- Push the v_default AsmValue instead of creating another instance
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants