-
Notifications
You must be signed in to change notification settings - Fork 0
/
Modsplan To Do List.txt
148 lines (95 loc) · 4.22 KB
/
Modsplan To Do List.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
Modsplan To Do List
Applications
Data parsing
Weather data?
Organize as library, with documentation
Domain-specific languages
Config, makefiles?
Could a version of this aid in translating natural language?
Uncategorized
Modify all to parse unicode text?
All files should be unicode, because all may contain comments
With some restrictions, like only tabs or spaces for indenting
How much work is this?
BBEdit 'Run in Terminal' closes output window immediately when exception occurs --
can we fix this or workaround, to get traceback in output?
Catch and "print traceback.format_exc()" if necessary.
Use logging module for debug output
Add command line option to save debug output to a file
Testing
Add tests to verify that various syntax errors are caught
Compiler
Option to show source line numbers in emitted code
Translate SBIL to LLVM assembler
Use type flags to handle operator overloading
Symbol table, handle and check types
Defn specs
Use (sub)directory per output language for .defn specs?
These share common (higher level) .defn specs with others (imported with "use")
Need to allow for selecting .defn dir different from grammar dir
Make clear when to use expansion vs substitution in defn specs. ###
Expansion generates multiple instructions. (Always?)
Substitution (a defn "word") joins resulting strings with spaces.
Add type flags to signatures, constraining matches.
Parsetree
Use xml.etree.ElementTree instead of parsetree.py
Output tree to XML file
line, col are attributes
token text in attribute, or content of tag?
(probably content)
Grammar
Ignore '#' inside quotes (even if preceded by space): not a comment.
Make first (non-imported) nonterm in file the default root. ###
Syntax parser
Parse trace
Simplify code for tracing output and maxtokens tracking
For next token display, indicate when backing up.
Test syntax error reporting.
Research how others do this?
Rewrite inprefixes(), etc. so it works with uppercase keywords.
How distinguish (in prefixes) uppercase keywords from kindnames?
Store a quote char in the prefix?
Store two prefix sets, one for kindnames, one for literals?
This will be more efficient in inprefixes().
List KEYWORD tokens in .tokens grammar?
Then look at token name to decide what to compare to prefixes.
This looks easiest to code,
but requires keeping .tokens up to date with .syntax.
Should alphabetic literals be automatically recognized as keywords?
Would keywords have to appear before NAME in .tokens?
But tokenizer classes all letters L or U,
so keywords will never be recognized!
This is a problem also for boolean operators.
First look for token text in prefixes, then for token name?
See notes 2011-06-08.
estimate: 3 hours
Compare syntax parser to tokenizer, examine common algorithms.
What are differences? Are these bugs?
For example, should parse_item() be able to return a zero-token match?
Can simplifications be made?
Should common code be factored out?
estimate: 2 hours
Tokenizer
Recognize multi-line comments. (May have to do this in syntax parser.)
Lineparsers
DONE:
[See git repository for changes since 2012-12-27]
2013-02-15 Registered modsplan.com
Implemented command line invocation in syntax.py
DONE 2013-01-02
Fixed handling of + quantifier in parse_alt().
DONE 2012-12-27
Don't add literals (keywords and punctuation) to syntax tree
Not needed for code generation?
DONE 2011
Rewrote Item to always store a string.
At least give it methods to hide implementation, and do not access
its attributes externally (if not too difficult).
Need this to handle uppercase keywords properly.
(Keywords are literals, which will include quotes.)
[See notes 2011-06-01..02]
estimate: 2 hours
DONE 2011-06-15, 3 hours
Use '' (empty string) or None for name of unclassified tokens,
(instead of SYMBOL), in case users use SYMBOL for a tokenkindname.
DONE 2011-06-10, 0.5 hours