Stack vs. register based VMs #8
Comments
Another question is whether the compiled bytecode (or word-code, etc.) is intended for distribution or just for running in memory. The trade-offs differ. Stack-based code is more compact for distribution. Also see WASM for a "structured stack machine" variant designed for JITting.
I'd be interested to see how variable-width register bytecode would fare in that regard, and whether that approach carries a performance penalty.
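To make the size trade-off concrete, here is a minimal sketch in Rust that encodes `a * b + c * d` both ways. Everything in it is an assumption for illustration: the instruction sets (`StackOp`, `RegOp`), the helper names, and the byte encoding (one opcode byte plus one byte per operand). A real VM would likely encode things differently.

```rust
/// Stack-machine instruction: arithmetic operands are implicit (top of the stack).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StackOp {
    Load(u8), // push the given local slot onto the stack
    Add,
    Mul,
}

/// Register-machine instruction: every operand is named explicitly.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RegOp {
    Add { dst: u8, a: u8, b: u8 },
    Mul { dst: u8, a: u8, b: u8 },
}

/// Encoded size under the assumed scheme: 1 opcode byte + 1 byte per operand.
pub fn stack_len(code: &[StackOp]) -> usize {
    code.iter()
        .map(|op| match op {
            StackOp::Load(_) => 2,
            StackOp::Add | StackOp::Mul => 1,
        })
        .sum()
}

/// Every register instruction is opcode + dst + a + b = 4 bytes.
pub fn reg_len(code: &[RegOp]) -> usize {
    code.len() * 4
}

pub fn run_stack(code: &[StackOp], locals: &[i64]) -> i64 {
    let mut stack: Vec<i64> = Vec::new();
    for op in code {
        match *op {
            StackOp::Load(i) => stack.push(locals[i as usize]),
            StackOp::Add => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a + b);
            }
            StackOp::Mul => {
                let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
                stack.push(a * b);
            }
        }
    }
    stack.pop().unwrap()
}

pub fn run_reg(code: &[RegOp], regs: &mut [i64]) {
    for op in code {
        match *op {
            RegOp::Add { dst, a, b } => regs[dst as usize] = regs[a as usize] + regs[b as usize],
            RegOp::Mul { dst, a, b } => regs[dst as usize] = regs[a as usize] * regs[b as usize],
        }
    }
}

/// a * b + c * d with a..d in local slots 0..3.
pub fn stack_program() -> Vec<StackOp> {
    use StackOp::*;
    vec![Load(0), Load(1), Mul, Load(2), Load(3), Mul, Add]
}

/// Same expression with a..d in r0..r3 and temporaries in r4, r5; result ends up in r4.
pub fn reg_program() -> Vec<RegOp> {
    vec![
        RegOp::Mul { dst: 4, a: 0, b: 1 },
        RegOp::Mul { dst: 5, a: 2, b: 3 },
        RegOp::Add { dst: 4, a: 4, b: 5 },
    ]
}
```

Under these assumptions the stack version is 7 instructions in 11 bytes and the register version is 3 instructions in 12 bytes: slightly smaller code on one side, fewer dispatches on the other. The gap depends heavily on the chosen encoding, so treat the numbers as illustrative only.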
I think we must not exclude another important technique for writing VMs, namely the AST-based (or similar) one. I say this technique is important mainly because many "young" or "simple" implementations choose it, especially since it is trivial and requires almost no additional effort or knowledge. For example, when I investigated Scheme/Lisp implementations in Rust, almost two thirds were AST-based (I counted all of them, regardless of "maturity"). Under this AST-based VM category I think there are two main approaches, distinguished mainly by their "efficiency":
For example, in my Scheme implementation ( https://github.com/volution/vonuvoli-scheme ) I took the second approach, the expression-based evaluator, mainly for two reasons:
The expression structures can be seen at the following link:
On second thought, perhaps there is a fourth kind of VM: the "transpiled" ones, i.e. the hosted language generates Rust code, which is then compiled and executed (or, by extension, the hosted language can be "transpiled" to any other hosted language and run). Although this fourth category isn't actually a VM, my take on this thread is that it tries to see "how" one can actually evaluate the hosted code.
I excluded them because I was only thinking of what you called the first approach. Literal AST walking is just too slow and cumbersome, since a parser AST is not made to be executed. The second approach seems a lot more interesting. IMO, the boundary with actual bytecode is quite blurry: the more optimized and low-level the expression-based form becomes, the more it will look like bytecode. As I see it, the difference is more about how the intermediate form is constructed than an inherent difference in abstraction. Decoding and encoding expressions/operations by hand is probably more compact and gives more control, but it's clear that using standard Rust enums and structs is simpler and safer. The real performance difference remains to be seen. Also, if the hot paths are JITted anyway, baseline interpreter performance may become less important.
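As a rough illustration of the "expression-based" middle ground built from plain Rust enums, here is a sketch that lowers a parser-style AST into a form meant for execution: variable names are resolved to slot indices and constant subexpressions are folded once, ahead of evaluation. The types and functions (`Ast`, `Expr`, `lower`, `eval`) are hypothetical and are not taken from vonuvoli-scheme or any other implementation mentioned here.

```rust
/// Parser-level AST: variables are still names.
#[derive(Debug, Clone, PartialEq)]
pub enum Ast {
    Num(i64),
    Var(String),
    Add(Box<Ast>, Box<Ast>),
    Mul(Box<Ast>, Box<Ast>),
}

/// Execution-oriented expression form: variables are pre-resolved slot indices.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(i64),
    Slot(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Lower an AST into an `Expr`, resolving names against `scope` and folding
/// constant subexpressions. Returns `None` for an unbound variable.
pub fn lower(ast: &Ast, scope: &[&str]) -> Option<Expr> {
    Some(match ast {
        Ast::Num(n) => Expr::Const(*n),
        Ast::Var(name) => Expr::Slot(scope.iter().position(|s| *s == name.as_str())?),
        Ast::Add(l, r) => match (lower(l, scope)?, lower(r, scope)?) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Ast::Mul(l, r) => match (lower(l, scope)?, lower(r, scope)?) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
    })
}

/// Evaluate a lowered expression; no name lookups happen at this point.
pub fn eval(expr: &Expr, slots: &[i64]) -> i64 {
    match expr {
        Expr::Const(n) => *n,
        Expr::Slot(i) => slots[*i],
        Expr::Add(l, r) => eval(l, slots).wrapping_add(eval(r, slots)),
        Expr::Mul(l, r) => eval(l, slots).wrapping_mul(eval(r, slots)),
    }
}
```

For instance, lowering `(1 + 2) * x` with scope `["x"]` yields `Mul(Const(3), Slot(0))`, which evaluates to 15 when slot 0 holds 5. Each further lowering step of this kind (flattening the boxes into an array, packing operands by hand) moves the form closer to what one would call bytecode.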
Initially (when I started implementing my interpreter) I agreed with your definition of bytecode, i.e. anything that is "constructed" from the AST but is not native code. However, after a few months I changed my opinion, and I now consider these "expression"-based interpreters a middle ground between AST and bytecode, neither one nor the other. Definitely better than AST, but clearly less efficient than bytecode.
I have compared my interpreter's performance with Chibi-Scheme, which is C-based and does feature an actual bytecode VM, and the difference is clear: their bytecode VM is a few times faster than my expression VM for "CPU-intensive expressions" (i.e. arithmetic, looping, etc.). (Looking at their code, apart from the VM, I would say the overhead should be similar, so I would consider this a "fair" comparison.)
Indeed I agree with what you have pointed out above, because when I am comparing "real-life" Scheme code (like for example re-implementing a simple
Therefore my take for anyone wanting to implement hosted languages is: instead of investing a lot of time in designing and implementing a bytecode VM (either stack- or register-based), one could spend that effort on implementing more efficient and extensive builtin functionality (i.e. utility functions and syntaxes) directly in Rust, so that the majority of the CPU time is spent doing actual "work" instead of "interpreting" the bytecode. (Because no matter how efficient a VM is, it can't beat native code.) :)
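A minimal sketch of that idea, assuming a toy interpreter whose values are plain `i64`s: builtins are ordinary Rust functions registered in a table, so a hosted call like `(sum 1 2 3 4)` costs one lookup and then runs entirely as native code. The names `Builtin`, `builtin_table`, and `call_builtin` are made up for this example.

```rust
use std::collections::HashMap;

/// Signature assumed for native builtins in this sketch.
pub type Builtin = fn(&[i64]) -> i64;

fn native_sum(args: &[i64]) -> i64 {
    args.iter().sum()
}

fn native_product(args: &[i64]) -> i64 {
    args.iter().product()
}

/// Table mapping hosted-language names to native Rust functions.
pub fn builtin_table() -> HashMap<&'static str, Builtin> {
    let mut table: HashMap<&'static str, Builtin> = HashMap::new();
    table.insert("sum", native_sum);
    table.insert("product", native_product);
    table
}

/// Dispatch once by name, then run the whole operation natively.
/// Returns `None` when no builtin with that name is registered.
pub fn call_builtin(
    table: &HashMap<&'static str, Builtin>,
    name: &str,
    args: &[i64],
) -> Option<i64> {
    table.get(name).map(|f| f(args))
}
```

The interpreter overhead here is one table lookup per call, regardless of how many elements the builtin processes, whereas a loop written in the hosted language pays the dispatch cost on every iteration.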
These are the two prevalent implementation techniques for virtual machines. I'd like to collect some resources about the trade-offs.
On a high level:
Examples:
Stack based VMs
Register Based VM
Alternatives:
There are also belt machines. I don't know of any practical VMs that use one; I came across the concept via the Mill CPU architecture. May be worth evaluating.
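For anyone unfamiliar with the concept, here is a toy sketch of how I understand the belt model: every result is pushed onto the front of a fixed-length queue, operands are addressed by their position relative to the front (0 is the newest value), and the oldest value falls off the far end. This is a simplified illustration only, not a description of the actual Mill design; `BeltOp`, `Belt`, and `run_belt` are invented names, and a capacity of at least 1 is assumed.

```rust
use std::collections::VecDeque;

/// Toy belt instructions: operands are belt positions (0 = newest value).
#[derive(Debug, Clone, Copy)]
pub enum BeltOp {
    Const(i64),
    Add(usize, usize),
    Mul(usize, usize),
}

/// Fixed-length belt; assumes `capacity >= 1`.
pub struct Belt {
    slots: VecDeque<i64>,
    capacity: usize,
}

impl Belt {
    pub fn new(capacity: usize) -> Self {
        Belt { slots: VecDeque::with_capacity(capacity), capacity }
    }

    /// Push a new result onto the front; the oldest value falls off when full.
    pub fn push(&mut self, value: i64) {
        if self.slots.len() == self.capacity {
            self.slots.pop_back();
        }
        self.slots.push_front(value);
    }

    /// Read the value at a belt position, if it is still on the belt.
    pub fn get(&self, pos: usize) -> Option<i64> {
        self.slots.get(pos).copied()
    }
}

/// Run a program and return the newest belt value. Returns `None` if an
/// operand has already fallen off the belt (or was never produced).
pub fn run_belt(code: &[BeltOp], capacity: usize) -> Option<i64> {
    let mut belt = Belt::new(capacity);
    for op in code {
        let result = match *op {
            BeltOp::Const(n) => n,
            BeltOp::Add(a, b) => belt.get(a)? + belt.get(b)?,
            BeltOp::Mul(a, b) => belt.get(a)? * belt.get(b)?,
        };
        belt.push(result);
    }
    belt.get(0)
}
```

With the program `Const(2), Const(3), Mul(0, 1), Const(4), Add(0, 1)` this computes `2 * 3 + 4`. Note that instructions name no destination at all, and that a belt that is too short for the program loses operands before they are used, which is why the sketch returns an `Option`.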