Token capture #481

Open
tavmem opened this issue Aug 25, 2017 · 15 comments

Comments

Collaborator

tavmem commented Aug 25, 2017

in kona

$ rlwrap kona/k
K Console - Enter \ for help
  a:b
value error
b
^
  \v
,`a          /`a got added to K-Tree
  _n ~ a
1

in k3.2

a:b
value error
b
^
\v
             /empty K-Tree
tavmem added the bug label Aug 25, 2017
Collaborator Author

tavmem commented Aug 25, 2017

On the face of it, this one seems complex.
Looks like it requires reversing the parse order of tokens (to be consistent with left-of-right evaluation).

Collaborator Author

tavmem commented Aug 27, 2017

The current parse order was part of the original design ... it's not a bug.
Changing the label to "enhancement".

tavmem added enhancement and removed bug labels Aug 27, 2017
tavmem changed the title from "Erroneous posting to K-Tree" to "Posting to K-Tree" Aug 27, 2017
Collaborator Author

tavmem commented Aug 29, 2017

It's not simply a reversal of direction.
A line is parsed (which may contain multiple statements).
Once the line is tokenized, the number of statements (and their bounds) can be determined.
Then, statements should be processed left-to-right.
But, within a statement, token-capture should be done right-to-left.
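
The ordering described above can be sketched in a few lines. This is a minimal illustration in Python, not kona's C code: the names `tokenize`, `split_statements` and `capture_order` are hypothetical, the tokenizer is deliberately crude, and nesting inside brackets or parentheses is ignored.

```python
import re

def tokenize(line):
    # Crude tokenizer (an assumption for illustration): names, integers,
    # and any other single non-space character.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", line)

def split_statements(tokens):
    # Once the line is tokenized, ';' tokens give the statement bounds.
    # Nesting inside (), [] and {} is ignored in this sketch.
    statements, current = [], []
    for tok in tokens:
        if tok == ";":
            statements.append(current)
            current = []
        else:
            current.append(tok)
    statements.append(current)
    return statements

def capture_order(line):
    # Statements are visited left-to-right; within each statement the
    # tokens are visited right-to-left.
    return [list(reversed(stmt)) for stmt in split_statements(tokenize(line))]
```

For the line `c:1;a:b;d:2`, `capture_order` visits `1 : c`, then `b : a`, then `2 : d`, so the undefined `b` would be seen before `a` is touched.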

tavmem changed the title from "Posting to K-Tree" to "Token capture" Aug 29, 2017
Collaborator Author

tavmem commented Aug 29, 2017

Another example of the inconsistency of the current process used by Kona (left-to-right token capture, with some special-case adjustments):

$ rlwrap ~/kona/k
K Console - Enter \ for help
  c:1;a:b
value error
b       /Correctly flagged the error
^
  \v
`c `a   /But did a token-capture on `a
$ rlwrap ~/kona/k
K Console - Enter \ for help
  c:1;a:b;d:2    /Did not flag the error
  \v
`c `a `b `d      /Did a token-capture on both `a and `b.
                 /Should not proceed to parse the 3rd statement.

Contributor

bakul commented Aug 29, 2017

In k/kona, parsing is done left to right and evaluation right to left. The problem here is in error handling, not parsing or evaluation. This can be seen by, e.g., comparing how a[b]:c is handled by k3 vs kona. If none of the three variables exist, in k3 you get a value/parse error pointing to c. If c exists, you get a value/parse error pointing to b. If b exists, you get a type error pointing to a. kona produces no error and creates null variables a b c.

Conceptually, what happens is that parsing creates a parse tree for each statement. During evaluation, if there is any error, evaluation of the rest of the tree is abandoned. This is what happens in k3 but not in kona, as can be seen with a:b;c:1: k3 abandons the whole line, while kona doesn't and gives c a value of 1. k3's behavior is more useful because in general you don't want to continue a computation after an error.

Now consider what happens with f:{a[b]:c}. In this case k3's \v will show f a b c are defined. So k3 handles the same expression at top level differently from when it is in a function. In the case of a function, it doesn't evaluate the body but notes all the variables referenced by it and enters them in the K-tree if not already present. This can be a useful behavior, in that f can be defined before a b c are defined, whereas an expression like a[b]:c at the top level, when none of a b c are defined, is definitely an error.

I think that in k3, evaluating f[] should result in error and the fact that it doesn't is an artifact of how functions are implemented in k3. In particular it should be possible to represent uninitialized variables. Instead k3 simply uses null. In my view kona extends this less desirable behavior in the name of consistency.

Recommendation: fix the top level behavior to match k3. Second, fix behavior after an error. For functions it would be nice if evaluation of functions such as f above results in error but that is probably a bigger change.

Collaborator Author

tavmem commented Aug 30, 2017

First of all ... thanks for your comment.

I agree that (currently) kona parses left to right and evaluates right to left, and that kona has a problem in error handling. Some errors can be caught while parsing, and some errors can only be caught later, in evaluation.

Currently, in kona, it seems that the parse process has multiple steps. First, a line is converted to tokens. Then, a parse tree is created for the whole line (not a separate parse tree for each statement). While creating the parse tree, kona "captures" each token in order to include it in the parse tree. On a first look, it appears to me that more errors could be identified during parse-tree creation if token-capture (within a statement) was done right to left. (This theory may or may not work out.)

At the top level, kona (just like k3) treats statements within a function differently from normal statements. For your example, f:{a[b]:c}, kona only defines f. Kona does not add `a `b and `c to the K-TREE, which may be an improvement over k3. However, consistent with k3, f[] also yields null.

Collaborator Author

tavmem commented Aug 31, 2017

It's probably useful to get A+ working on Fedora, and examine the parsing and error handling used there.

Contributor

bakul commented Sep 1, 2017

Right about kona and f:{a[b]:c}. I accidentally started up k instead of kona : )

Let me try again. K3 captures a parse of one or more complete top-level expressions in one or more lines before evaluating them. For example:

a;(2
   b:3);b

This will produce three top-level parse trees (or one, if you consider "sequence" as a node). Now if a is undefined, the rest of the sequence must not be evaluated and b is not entered in the var list. k3 does this. kona doesn't. Instead kona happily continues, sets b to 3 and then reports 3 as a result. In addition, \v shows both a and b. This tells me there are two problems: 1. with error handling, and 2. with entering previously undefined symbols in the K-tree even on an error.
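
As a rough illustration of "capturing a complete expression across lines before evaluating", here is a small Python sketch. It is an assumption about one plausible approach (bracket balancing), not a description of how k3 or kona actually decide completeness; `is_complete` and `read_expression` are hypothetical names, and strings and comments are ignored.

```python
def is_complete(text):
    # Treat the input as complete once every (, [ and { has been closed.
    closers = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                raise ValueError(f"unbalanced {ch!r}")
    return not stack

def read_expression(lines):
    # Accumulate lines until the buffer is complete; nothing is evaluated here.
    buffer = ""
    for line in lines:
        buffer = buffer + "\n" + line if buffer else line
        if is_complete(buffer):
            return buffer
    return None  # input ended while still inside an open bracket
```

With the two input lines of the example above, the first line alone is incomplete because of the open parenthesis, and the whole sequence only becomes available for parsing after the second line.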

Collaborator Author

tavmem commented Sep 1, 2017

The case you identified can be fixed with an adjustment to the "capture" part of the parse process.
It simplifies as a;b:3
I will open it as a separate issue and fix it. (Issue 482)

The case which is causing me to wonder whether the "capture" order (within a statement) in the parse process should be right-to-left is a:b where both a and b are undefined.

Collaborator Author

tavmem commented Sep 1, 2017

Meanwhile, I was able to get A+ (from www.aplusdev.org) to compile and install on Fedora-26.
It surprised me that I needed to modify 11 source files to get A+ to compile.
I will post the modifications in a separate repository on github.

I also got XEmacs running on Fedora-26 (which is the recommended environment for running A+).

I'm currently having a problem getting the APL fonts to work. Hope to resolve that this weekend.

Then, maybe, I can find out what A+ does in the case that is equivalent to the kona case of a:b where both `a and `b are undefined. Of course, there may not be a structure equivalent to the K-TREE in A+.

Contributor

bakul commented Sep 1, 2017

Re: token capture order. This is how I view the process: parsing goes from left to right but nothing is evaluated during parsing. Not even symbol lookup. Thus for example, given a:b; c:3, you get a parse tree that can be conceptually represented like this:

(sequence (amend verb-: symbol-a symbol-b) (amend verb-: symbol-c int-3))

Here symbol-a denotes a single object whose type is symbol. Similarly for verb-: and int-3. Now when you evaluate a sequence, evaluation proceeds from left to right.

The first sub-tree is (amend verb-: symbol-a symbol-b). Here you evaluate the slot that has symbol-b first. If this fails, you don't even look at symbol-a. The amend fails and, as a result, the rest of the top-level sequence is abandoned.

Note that amend is different than apply in how the left argument is evaluated.
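
The evaluation model described in this comment can be sketched as a tiny tree walker. This is a toy in Python under stated assumptions: tuples stand in for parse-tree nodes, a dict stands in for the K-tree, the verb slot is omitted, and `KValueError`, `evaluate` and `run` are hypothetical names. It is not kona's or k3's implementation.

```python
class KValueError(Exception):
    """Raised when a symbol has no entry in the toy K-tree."""

def evaluate(node, ktree):
    kind = node[0]
    if kind == "int":
        return node[1]
    if kind == "symbol":
        name = node[1]
        if name not in ktree:
            raise KValueError(name)
        return ktree[name]
    if kind == "amend":
        _, target, source = node
        value = evaluate(source, ktree)  # right-hand slot is evaluated first
        ktree[target[1]] = value         # target is only entered on success
        return value
    if kind == "sequence":
        result = None
        for child in node[1:]:           # statements run left to right
            result = evaluate(child, ktree)  # an exception abandons the rest
        return result
    raise TypeError(f"unknown node kind: {kind}")

def run(tree, ktree):
    try:
        evaluate(tree, ktree)
        return None
    except KValueError as err:
        return f"value error: {err}"

# a:b; c:3 with nothing defined
tree = ("sequence",
        ("amend", ("symbol", "a"), ("symbol", "b")),
        ("amend", ("symbol", "c"), ("int", 3)))
```

Running `run(tree, {})` reports a value error on `b` and leaves the dict empty: neither `a` nor `c` is entered, which is the k3 behavior described in this thread.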

Collaborator Author

tavmem commented Sep 2, 2017

Your view is of how the process should work ... and that view makes sense.
The problem at hand is that when kona parses the statement a:b (where `a and `b are undefined) kona adds `a to the K-TREE well before evaluation begins.
At this point, I'm looking to fix the problem (i.e., stop kona from adding `a to the K-TREE during the parse process), not rewrite kona.

Collaborator Author

tavmem commented Sep 2, 2017

However, your view is sort of making my point.
The problem in kona is that symbol-a is "captured" first (and added to the K-TREE since symbol-a is the target of an amend).
If symbol-b is "captured" first, an error is thrown, and the parse process is abandoned.
It still looks to me that the token-capture process needs to go from right-to-left (within a statement).
The key to a fix is to process symbol-b first.
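
To make the difference between the two capture orders concrete, here is a toy model in Python. It assumes a single statement of the form `target : source`, a dict in place of the K-TREE, and `None` in place of null; the function `capture` is hypothetical and much simpler than kona's real capture code.

```python
def capture(tokens, ktree, right_to_left):
    # tokens is one statement of the form [target, ":", source_name].
    target, _, source = tokens
    steps = ["source", "target"] if right_to_left else ["target", "source"]
    for step in steps:
        if step == "target":
            # Capturing an amend target enters it in the K-tree as null.
            ktree.setdefault(target, None)
        elif source not in ktree:
            # Capturing an undefined source aborts the whole process.
            return "value error"
    return "ok"
```

With left-to-right capture of `a:b`, `a` is entered as null before `b` is found to be undefined. With right-to-left capture, `b` fails first and the K-tree is left empty.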

Collaborator Author

tavmem commented Sep 3, 2017

Got the APL fonts to work in Mozilla Firefox, but not yet in XEmacs.

tavmem mentioned this issue Mar 18, 2018
Collaborator Author

tavmem commented Mar 27, 2018

An even simpler example of the problem:

$ rlwrap -n ~/kona/k
kona      \ for help. \\ to exit.

  a
value error
a
^
  \v
  a:a
  \v
,`a
  a~_n
1
