Deep dive notes #965

Baltoli · 2024-01-31T17:01:14Z

Session 1

Why 2 llvm-kompile cmake files?
Cleanup of AST library
- Document split between definition / term language - 2 parts
Document sort category design decisions - "how do I represent a term of this sort at runtime"
Numbering / discarding - consistent numbering
Layout - identify sort category of symbol's children
LLVM-specific setup - prelude etc.
- Blocks vs. headers vs. terms - write down info!
- Better place for structure representations than the LLVM header
- 10 unused bits - bytes changes
Functions emitted for each axiom
- Alphabetical order "ABI" - document!!
- Side condition (alpha-subst -> bool), apply rule (alpha-subst -> term)
- Matching - debug support, drilling down into terms to see why rules match or don't
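A rough model of the two per-axiom functions, to make the shape of the "ABI" concrete. It assumes that "alphabetical order" means the variables of the substitution are passed as positional arguments sorted by name; the notes do not spell this out. The rule, the variable names `X` and `Y`, and all function names here are hypothetical, and this is not the code the backend emits.

```python
def abi_args(subst):
    """Order a substitution's values by variable name (assumed ABI)."""
    return tuple(subst[name] for name in sorted(subst))


# Hypothetical rule: minus(X, Y) applies only when X > Y.
def side_condition(x, y):
    # substitution -> bool
    return x > y


def apply_rule(x, y):
    # substitution -> term (terms modelled as plain tuples)
    return ("minus", x, y)


def try_axiom(subst):
    """Return the rewritten term, or None if the side condition fails."""
    args = abi_args(subst)
    if not side_condition(*args):
        return None
    return apply_rule(*args)
```

Because the argument order is derived from the names alone, the caller and the emitted function agree without sharing any extra metadata, which is presumably why it needs documenting.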
Pattern matching - document YAML file format!
Rewrite runtime support (some of it?) from LLVM into C++
GDB uses the Sort* typedefs; no actual information carried wrt. the language
GC arenas - resources on general approach?
Arenas - why 3 separate? Generational - new memory to young space, promoted to old space - idea is that long-lived objects don't need to be scanned so frequently.
- Theo working on stackmap GC; too complex for now
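A minimal sketch of the young/old promotion idea, as a toy model only. It covers two spaces; the third arena is not modelled because the notes do not say what it is for. The class name, the `promote_after` parameter and the use of integer object ids are assumptions made for illustration.

```python
class GenerationalHeap:
    def __init__(self, promote_after=1):
        self.young = {}   # object id -> number of collections survived
        self.old = set()  # promoted objects; not scanned by collect_young
        self.promote_after = promote_after
        self._next_id = 0

    def allocate(self):
        """New memory always goes to the young space."""
        obj = self._next_id
        self._next_id += 1
        self.young[obj] = 0
        return obj

    def collect_young(self, live):
        """Reclaim dead young objects and promote long-lived survivors."""
        survivors = {}
        for obj, age in self.young.items():
            if obj not in live:
                continue  # unreachable: reclaimed
            if age + 1 >= self.promote_after:
                self.old.add(obj)  # long-lived: stop scanning it
            else:
                survivors[obj] = age + 1
        self.young = survivors
```

The point of the split is visible in `collect_young`: once an object is in `old`, later young collections never touch it.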

Session 2

Things not documented in the pattern matching document:
- As-bindings
- Injections; pattern matching modulo triangles thereof.
  - Backend makes the assumption that we are pattern matching over terms that have collapsed their injections together. Can always assume there is at most one injection above any term.
  - Document the way that pattern matching handles injections in the Scala code (@dwightguth)
  - Similarly for overloads
  - "Least form" of a given term; code behaves "as if" terms were in this least form.
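One plausible reading of the "at most one injection above any term" assumption, sketched as a normalisation pass. The `Inj` and `App` classes and the function name `least_form` are hypothetical and do not correspond to the backend's or the Scala code's data structures; the sketch also assumes terms are well-sorted and ignores overloads.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class App:
    symbol: str
    args: Tuple["Term", ...] = ()


@dataclass(frozen=True)
class Inj:
    source: str  # sort being injected from
    target: str  # sort being injected into
    child: "Term"


Term = Union[App, Inj]


def least_form(term):
    """Collapse stacked injections so no Inj has an Inj as its child."""
    if isinstance(term, App):
        return App(term.symbol, tuple(least_form(a) for a in term.args))
    child = least_form(term.child)
    if isinstance(child, Inj):
        # inj{B, C}(inj{A, B}(t)) collapses to inj{A, C}(t)
        return Inj(child.source, term.target, child.child)
    return Inj(term.source, term.target, child)
```

A matcher that only ever sees terms in this form can treat an injection as a single layer, which is the "as if" behaviour described above.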
Garbage collector: generational copying collector is the term of art
Live memory identified using tracing (Cheney's algorithm)
Relationship between collection / migration / evacuation / forwarding.
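To relate these terms, here is a small model of Cheney-style copying collection. Heap addresses are list indices, and each object is a `[payload, children]` pair; both are simplifications for illustration. A real collector stores the forwarding pointer in the evacuated object's header, whereas this sketch uses a side dictionary.

```python
def cheney_collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space.

    from_space: list of [payload, [child addresses]] objects.
    Returns (to_space, new_roots) with all addresses rewritten.
    """
    to_space = []
    forwarding = {}  # from-space address -> to-space address

    def evacuate(addr):
        # Copy an object once; later visits just follow the forwarding entry.
        if addr not in forwarding:
            payload, children = from_space[addr]
            forwarding[addr] = len(to_space)
            # Children still hold from-space addresses until scanned.
            to_space.append([payload, list(children)])
        return forwarding[addr]

    new_roots = [evacuate(r) for r in roots]

    # The to-space doubles as the work queue: everything before `scan`
    # has had its children evacuated and rewritten.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj[1] = [evacuate(child) for child in obj[1]]
        scan += 1
    return to_space, new_roots
```

Evacuation is the copy of a single object, forwarding is the record of where it went, and collection is the whole pass; anything never evacuated is garbage and is dropped along with the from-space.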
Implementation details:
- Block header for an object (recall prev. session for object representation)
- Some things get allocated using malloc, e.g. long strings
- Old objects survive >=1 collection
- Layout info used in collection to properly handle children
Biggest outstanding feature for the garbage collector: it currently runs only between top-level rewrite steps, not within them.
- At each step end, if we're nearly out of memory, free some memory by running the GC.
- In most cases, collecting before taking the next step means that we don't need to get more memory from the OS.
- New prototype from Dwight and Theo needs to use libunwind to get stack maps, which identify the garbage collection roots for a call stack.
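A toy model of the current "collect only between steps" policy. The `Heap` class, the 80% threshold and the `countdown` rule are all assumptions made for the sake of a runnable example; the real runtime's trigger condition is not stated in these notes.

```python
class Heap:
    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.collections = 0

    def allocate(self, size):
        # No collection can happen here: a step that over-allocates would
        # have to get more memory from the OS instead.
        self.used += size

    def collect(self, live_size):
        self.used = live_size
        self.collections += 1


def rewrite_loop(term, step, heap, live_size=1, threshold=0.8):
    """Apply step until it returns None, collecting only between steps."""
    while True:
        next_term = step(term, heap)
        if next_term is None:
            return term
        term = next_term
        # End of a top-level step: the current term is the only root.
        if heap.used >= threshold * heap.capacity:
            heap.collect(live_size)


def countdown(term, heap):
    """Hypothetical rule: n => n - 1, allocating 3 units per step."""
    if term == 0:
        return None
    heap.allocate(3)
    return term - 1
```

Running `rewrite_loop(5, countdown, Heap(capacity=10))` collects once, after the third step, and never needs more than the initial capacity. The stack-map prototype is aimed at the case this loop cannot handle: running out of memory in the middle of `step`.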
We may at some point want to address the fact that copying collectors are not efficient for large, long-lived objects. The solution would be to either use reference counting or some kind of fragmented space that gets periodically compacted.
- Needs a motivating case for GC being a bottleneck.
GcStats mode for the CMake build produces the correct output for the analysis script. Not used at the moment but will be important if we ever change the GC again.
