Use racket -y in racket-run? #730

Open
countvajhula opened this issue Nov 29, 2024 · 11 comments

Comments

@countvajhula

countvajhula commented Nov 29, 2024

Are you familiar with racket -y? I understand that it basically compiles not only the file supplied as an argument, but also any modules in the dependency chain (and just those modules and nothing extra) that have been modified since they were last compiled (the "y", according to the official docs, stands for -y, --make Yes, enable automatic update of compiled files 😄 ). I find it to be a reliable way to run any module and be sure that it reflects the actual current behavior including all upstream changes, without having to manually recompile a lot of stuff.

Would it make sense for racket-run to exhibit this kind of -y behavior by default, or potentially via a defcustom, or a separate command like racket-update-and-run? (Or is there already a way to do this? I didn't immediately see anything in the docs. edit: @soegaard on Discord kindly pointed out what appears to be the place in the source where files are run, but we didn't know what exactly this implies re: compilation of modules).

Ever since learning about racket -y, I've actually been assuming that racket-run already uses it, as it seems to take more time to run the buffer when an upstream module has been changed. Thinking again, that delay is likely due to Racket determining that the compiled version of the upstream code is stale and falling back to the uncompiled modules (without recompiling them). In today's Qi meeting, though, we noticed that modifying an upstream module and then racket-running the downstream one wasn't actually resulting in the change being picked up, whereas running racket -y <module> at the command line did reflect the changes.

Using -y would likely also end up addressing the delay with racket-running modules with changed module dependencies, as it should end up recompiling those modules instead of falling back to the uncompiled ones.

@greghendershott
Owner

Yes. Actually there's a 10 year old, double-digit issue number #41. 😄

As I mentioned in a comment there, I stumbled across https://github.com/rmculpepper/custom-load as an example of a way (or the way) to go about this. racket -y came many years later; I can also investigate how that works.

I think I never did it, so far, just because:

  • It seemed like a nice-to-have, but potentially risky wrt creating new bugs -- or at least behavior that might be confusing or annoying in ways different from the original annoyance?

  • Nobody really chimed in to ask for it -- although maybe that's because they assumed, like you, that of course it would already do that (which is reasonable to assume).

  • AFAIK DrRacket does not do this, and I wasn't sure (still not sure) if that's intentional, for good reasons.

    Command-line racket -y is its own process and instance of Racket.

    I'm not sure what happens when a single longer-running Racket process, as run by DrRacket or Racket Mode, starts doing this, triggered by individual racket-run commands. Can it even truly REload modules? If not, this feature might need to be weaker, e.g. you might need to racket-start-back-end to restart the whole back end Racket process. However that weaker feature might still be worthwhile?

  • p.s. FWIW Racket Mode doesn't create compiled subdirectories, or do what Dr Racket does with drracket/compiled subdirs. Not sure if that helps/hurts/NA for this.

So that's a quick brain dump (sorry) but I think that indicates you nudged my interest and I'm willing to explore this, finally, after 10 years. 😄

@greghendershott
Owner

p.s. I had a sloppy understanding of -y, it means to (re)compile zo files:

  * -y or --make : Enables automatic generation and update of compiled ".zo"
    files for modules loaded in the initial namespace. Specifically, the
    result of (make-compilation-manager-load/use-compiled-handler) is
    installed as the compiled-load handler before other module-loading
    actions. Caution: This flag is intended for use in interactive settings;
    using it in a script is probably a bad idea, because concurrent
    invocations of the script may collide attempting to update compiled
    files, or there may be filesystem-permission issues. Using
    -c/--no-compiled cancels the effect of -y/--make.
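
As a rough illustration of the excerpt above, the following Racket sketch installs the compilation-manager handler before requiring a module, which is approximately what the flag arranges at startup. The helper name `run-with-make` is hypothetical, and this is an assumption-laden sketch rather than anything Racket Mode does today.

```racket
#lang racket/base
(require compiler/cm)

;; `run-with-make` is a hypothetical helper name used only for illustration.
(define (run-with-make src)
  ;; The handler is created and used in the same namespace; the
  ;; compilation manager declines to compile if that changes in between.
  (parameterize ([current-load/use-compiled
                  (make-compilation-manager-load/use-compiled-handler)])
    ;; Requiring the module goes through the handler above, which creates
    ;; or updates compiled files for it and for the modules it requires.
    (dynamic-require (path->complete-path src) #f)))
```

A module that is already declared in the current namespace is not loaded again by `dynamic-require`, which is one form of the long-running-process concern raised earlier in the thread.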

So your feature request IIUC is to maintain compiled subdirs like Dr Racket.

Whereas the idea with #41 using custom-load was simply to avoid the "stale bytecode" problem: http://rmculpepper.github.io/custom-load/custom-load.html. One reason I liked that idea, was that people could still use whatever strategy they wanted wrt bytecode, raco pkg update, raco pkg migrate -- whatever, I wouldn't try to impose anything on them.

So I guess there are at least two distinct feature ideas to consider here? I'll keep re-loading my brain with details...

@greghendershott
Owner

greghendershott commented Nov 30, 2024

edit: @soegaard on Discord kindly pointed out what appears to be the place in the source where files are run, but we didn't know what exactly this implies re: compilation of modules)

To clarify: The status quo is that current-load/use-compiled is set to something that caches (in memory) fully expanded syntax for each module. But for cache misses, that just defers to the previous value of current-load/use-compiled, the default. The default compiled-load handler checks for .zo files, and loads them as-is, else compiles/loads (in memory) from source. It does not create or update any .zo files. So it does not behave like racket -y or DrR.
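
To make the "wrap the handler, defer to the previous one" shape concrete, here is a minimal Racket sketch. It only records which files pass through the handler instead of caching expanded syntax, and `call-with-load-log` is a hypothetical name, so treat it as an illustration of the pattern and not as the back end's actual code.

```racket
#lang racket/base

;; Hypothetical helper: run THUNK with a wrapped compiled-load handler and
;; return the list of paths that the handler was asked to load.
(define (call-with-load-log thunk)
  (define previous (current-load/use-compiled))
  (define loaded '())
  (parameterize ([current-load/use-compiled
                  (lambda (path name)
                    ;; Note the request, then defer to the previous handler.
                    ;; The default handler uses an existing .zo or compiles
                    ;; from source in memory; it does not write .zo files.
                    (set! loaded (cons path loaded))
                    (previous path name))])
    (thunk))
  (reverse loaded))
```

For example, `(call-with-load-log (lambda () (dynamic-require (path->complete-path "b.rkt") #f)))` would list the files loaded for a hypothetical `b.rkt`, and no compiled directory is created as a side effect.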

The status quo is based on the idea that, for many projects, it's fine to compile from source. For other projects, a raco make of some files, or a raco setup --pkgs of a linked package might be appropriate. Other projects might do cross-compilation, and/or build non-Racket artifacts like shared libraries.

In any case, it's left up to the user when/how to compile, if at all. The racket-run command doesn't try to get involved with that.

And so, personally, for some projects I might have a Makefile I run from project-compile, and also use remotely for CI.

Part of me likes that division of labor, and is starting to feel conservative like I have the past 10 years. 😄 But I still have an open mind.


Another idea is that there already exists an Emacs racket-before-run-hook variable. I think this could be used to do something like project-compile, or raco setup, or whatever. If that appeals to you, but turns out to have defects in practice, we could look at remedying those?

@countvajhula
Author

Thanks @greghendershott for sharing all that context -- I think it makes sense at a high level 😄 Having racket-run be agnostic to whether modules are compiled sounds like a good reason to retain things the way they are. re: raco setup, racket -y is much faster so I end up using it more often these days in practice. I'll explore some of these other options and report back soon on whether I'm able to get things to work to support my common/default need for the update+run behavior. I'm currently assuming that racket -y is something like / equivalent to raco make followed by racket, and if that's the case, using the before-run hook, or potentially M-x compile to do raco make before racket-running seem promising.

@greghendershott
Copy link
Owner

At one extreme, for the Racket Mode project itself, I never compile (except as part of make test locally and on CI, to catch errors, then make clean to nuke). It works fast enough, for me. But that's mainly because Racket Mode doesn't define or use sufficiently complicated macros: If compile time is fast enough, you don't notice. 😄

By contrast, for stuff like projects using Typed Racket, that compile-time type-checking can be slow enough to want to avoid redoing it, i.e. cache in zo/dep under compiled.


Actually, after a good night's sleep, I'm kind of warming up to the idea of doing both features, because they solve different problems:

  • This feature: writing to compiled (like, IIUC, DrRacket and racket -y).

  • The help-with-stale-bytecode feature in Offer to help with bytecode mismatches? #41. Because, IIUC, there can be dependencies where it's too late to re-compile. Either you load from source or fail with a bytecode mismatch error.

I think each would be harmless, and furthermore could be behind a customization flag. The default for each (on or off) idk yet; TBD.

So I'll mull this more. Meanwhile, if you do achieve some success (or failure) with racket-before-run-hook please share your experience.

@countvajhula
Author

I didn't really understand the difference between the two options you mentioned and wanted to read a bit more about it. Having done so, I still can't say I understand anything as I'm not able to reproduce most of the cases I was hoping to 😭. But for what it's worth, this is my present understanding of the options:

First, a relevant excerpt from the docs for raco make:

In addition to a bytecode file, raco make creates a file "compiled/‹name›_‹ext›.dep" that records dependencies of the compiled module on other module files and the source file’s SHA-1 hash. Using this dependency information, a re-compilation request via raco make can consult both the source file’s timestamp/hash and the timestamps/hashes for the bytecode of imported modules. Furthermore, imported modules are themselves compiled as necessary, including updating the bytecode and dependency files for the imported modules, transitively.

This makes me think that running raco make in any situation will fix all stale compilation issues both for the present module as well as upstream ones -- which is great. And it seems racket -y can in fact be thought of as raco make; racket.
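
For reference, the same make-style compilation is available from Racket code through `compiler/cm`. The sketch below assumes the default compiled-file settings; `make-module` is a hypothetical wrapper name.

```racket
#lang racket/base
(require compiler/cm)

;; Hypothetical wrapper: compile one module file the way `raco make` would,
;; compiling the modules it requires as needed and writing both the
;; bytecode (.zo) and dependency (.dep) files.
(define (make-module src)
  (managed-compile-zo (path->complete-path src)))
```

Calling `(make-module "b.rkt")` and then running `b.rkt` is the two-step version of what `racket -y` is assumed to do here.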

I thought to set up an experiment to try and reproduce the different scenarios that may be at play here.

Say there are two modules a and b, where b requires a, and both are compiled. Let's call a compiled module "stale" if it is older than the source module, and "fresh" otherwise.

Cases:

  1. a is fresh, b is fresh
  2. a is fresh, b is stale
  3. a is stale, b is fresh
  4. a is stale, b is stale
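
The fresh/stale definition used for these cases can be written down as a small Racket check. This is a simplified sketch: it compares timestamps only, whereas raco make also consults SHA-1 hashes and .dep files, and it assumes the default "compiled" sub-directory. The names `zo-path` and `zo-status` are hypothetical.

```racket
#lang racket/base

;; Where the bytecode for SRC lives under the default layout,
;; e.g. a.rkt -> compiled/a_rkt.zo next to the source file.
(define (zo-path src)
  (define-values (dir name must-be-dir?)
    (split-path (path->complete-path src)))
  (build-path dir "compiled" (path-add-extension name #".zo")))

;; 'missing if there is no .zo, 'stale if the .zo is older than the
;; source, and 'fresh otherwise.
(define (zo-status src)
  (define zo (zo-path src))
  (cond
    [(not (file-exists? zo)) 'missing]
    [(< (file-or-directory-modify-seconds zo)
        (file-or-directory-modify-seconds src))
     'stale]
    [else 'fresh]))
```

With this, case (3) is the situation where `(zo-status "a.rkt")` is `'stale` while `(zo-status "b.rkt")` is `'fresh`.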

I'm assuming racket-run is equivalent to racket <module>.

What do we expect to happen in each of these cases when we run b?

In (1) everything works fine and there are no surprises.

On the subject of (2), the custom-load docs say this:

To be more precise, a “link: module mismatch” error arises when Racket loads a ".zo" file with an ancestor whose ".zo" file is stale or missing; the ancestor is recompiled in a way inconsistent with the descendant, but it is too late to recompile to descendant, because its bytecode is already loaded. The handlers produced by this library recursively check ancestors before loading a descendant’s ".zo" file.

So it sounds like in case (2), we might get a "link: module mismatch" error.

Although I've seen link mismatch errors in the past, I wasn't able to reproduce one in this experimental setting.

In any event, based on the docs, it sounds like this is the case that @rmculpepper's custom-load would help with.

For case (3), based on the above docs, it doesn't sound like custom-load would detect this. If that's correct, we could expect the following scenario: if a user keeps making modifications to a, they may run b and it wouldn't give any warning about the stale compilation issue, because b is fresh, and is up to date with the compiled version of a as well. Instead, the behavior just wouldn't reflect their changes and they may keep poking at things until they finally realize the problem. At least for me, this seems to happen fairly often 😅

I was thankfully able to reproduce this case in the a, b experimental setup, and running b doesn't reflect changes in a, though it uses the older, compiled version of a without complaint (masking the problem).

It sounds like adding support for compiled folders / racket -y would address this case.

In (4), it seems like both custom-load as well as racket -y would detect this case.

In sum, my impression is that supporting compiled folders / racket -y behavior would address all 4 cases when modules are compiled. custom-load handles fewer cases, but it does not expect that modules must be compiled, so it makes sense for racket-run to use it and remain agnostic to compilation (and case (3) would not be handled), while providing a parallel feature to do racket -y. Does all this match how you're thinking about it?

@countvajhula
Author

re: using Emacs's compile, I tried this:

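(defun my-racket-compile ()
  "Compile this Racket module with `raco make'."
  (interactive)
  (when (buffer-file-name)
    ;; Quote the file name so paths containing spaces survive the shell.
    (let ((compile-command
           (format "raco make %s" (shell-quote-argument (buffer-file-name)))))
      ;; Outside a compilation buffer, `recompile' runs `compile-command'.
      (recompile))))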

I think that works?

Adding it to racket-before-run-hook didn't work too well since the compilation window pops up and obscures the REPL. The docs for compile say this:

Compile the program including the current buffer.  Default: run ‘make’.
Runs COMMAND, a shell command, in a separate process asynchronously
with output going to the buffer ‘*compilation*’.

... which makes me wonder whether the module could end up being racket-run before the compilation has finished.

But doing it in two steps (i.e. my-racket-compile followed by racket-run) seems safe. The only thing is, compulsively running these two steps may not be the smoothest UX.

@greghendershott
Owner

I'm assuming racket-run is equivalent to racket <module>.

Well, it's not really equivalent.

The closest equivalent isn't racket <module>, it's more like the Run command in DrR.

Each Racket Mode back end (normally there's one) consists of racket running the Racket Mode back end code, which supports zero or more REPLs. More words and pictures about this.

Each racket-run is doing module->namespace to give you a REPL "inside" that module. (There are a bunch of other details, including each run happening inside a fresh custodian, so that it can be killed and associated resources cleaned up.)
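
A heavily simplified Racket sketch of that idea follows. It is not the back end's code: `run-module` is a hypothetical name, and the real implementation handles REPL I/O, error context, caching, and much more.

```racket
#lang racket/base

;; Hypothetical sketch of one "run": load the module under a fresh
;; custodian and return a namespace "inside" it, plus a cleanup thunk.
(define (run-module src)
  (define path (path->complete-path src))
  (define cust (make-custodian))
  (define ns
    (parameterize ([current-custodian cust]
                   [current-namespace (make-base-namespace)])
      ;; Declare and instantiate the module, then obtain a namespace
      ;; corresponding to its body.
      (dynamic-require path #f)
      (module->namespace path)))
  ;; Shutting down the custodian releases resources the run created.
  (values ns (lambda () (custodian-shutdown-all cust))))
```

A caller could bind the two results with `define-values`, evaluate REPL input with `(eval form ns)`, and invoke the thunk before the next run.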

@greghendershott
Owner

greghendershott commented Dec 3, 2024

But doing it in two steps (i.e. my-racket-compile followed by racket-run) seems safe. The only thing is, compulsively running these two steps may not be the smoothest UX.

I understand it's annoying. OTOH the UX reflects the reality that compiling to zos might take a long time? That was a weakness of my suggestion to use racket-before-run-hook, I guess.

I think one thing to keep in mind is that, IIUC, Racket always compiles if there is no zo or an outdated one (but it won't necessarily produce a new zo). It should always work correctly[1] ... eventually. The question here is, can it cache previous compilations in zo files? For some projects that's N/A, for others it's nice-to-have, and for others (with non-trivial compile-times) must-have.

Footnotes

  1. But do you have examples where it doesn't work for you (it's not just slow, it's wrong)? It sounded like that was the original motivation for this. I'm not clear if you were getting old zo instead of newer source (you should get the latter automatically, albeit more slowly), and if so, why. That by itself could be a bug to fix, vs. a new feature to add.

@countvajhula
Author

I think one thing to keep in mind is that, IIUC, Racket always compiles if there is no zo or an outdated one (but it won't necessarily produce a new zo). It should always work correctly[1] ... eventually. The question here is, can it cache previous compilations in zo files? For some projects that's N/A, for others it's nice-to-have, and for others (with non-trivial compile-times) must-have.

Yes, that sounds right to me. One thing that has confused me in the past is that uncompiled modules take way longer to run, and that has sometimes inflated benchmarks, like these for the relation package. But upon reflection now, that's surely because it includes the additional time necessary to load and compile the module, and the benchmark doesn't factor that out. So yes -- this makes sense. All modules are compiled every time, just not necessarily cached.

In that case, it would seem that if the downstream module is ever stale, it will always behave correctly when run, since it will be recompiled, and transitively compile dependencies.

But do you have examples where it doesn't work for you (it's not just slow, it's wrong)? It sounded like that was the original motivation for this. I'm not clear if you were getting old zo instead of newer source (you should get the latter automatically, albeit more slowly), and if so, why. That by itself could be a bug to fix, vs. a new feature to add.

I would say case (3) above. Upstream module is modified but not recompiled. Downstream module continues to use the stale zo file since the zo files are mutually consistent. I find this is a common case (e.g. change source code and then re-run tests. These days I use racket -y to run tests).

the UX reflects the reality that compiling to zos might take a long time? That was a weakness of my suggestion to use racket-before-run-hook, I guess.

In most cases I encounter in practice it usually takes seconds, so not too long. But as we would be spending this time in any case with racket-run (since it always compiles), it would be ideal to update zo files if they are present at this point, since otherwise we have to pay this cost each time we racket-run, or at least (the way I usually do it) recompile at the command line (or via Emacs recompile) first before racket-running the next time.

That is, assuming a module is stale and already has a .zo file, it seems like it is always a good idea to update it on first compilation instead of recompiling without updating the cache. But that still doesn't necessarily include case (3) as racket-run will happily run the module's compiled version if it isn't stale, and yet, would not reflect changes in stale dependencies.

@greghendershott
Owner

greghendershott commented Dec 5, 2024

Thanks for the summary. On first read that all sounds good.

Also:

  • I think the "link mismatch" scenario can still arise, so doing something like Offer to help with bytecode mismatches? #41 to ignore the zo might still be beneficial in some cases.

  • DrRacket does not write zos to the compiled directories used by raco make -- instead it writes to sub-directories e.g. compiled/drracket and compiled/drracket/errortrace. It's unclear to me if it's safe for Racket Mode to reuse DrR's subdirs, or, it needs to add even more subdirs. (Five total caches seems gross, but maybe necessary?)

  • Speaking of that errortrace subdir, that's another wrinkle. If you've set the Emacs var racket-error-context to 'high, then it's using errortrace... which means rewriting your whole program as "errortrace-able"... which shouldn't be dumped in the same cache dir.

  • racket-xp-mode uses drracket/check-syntax, which needs fully-expanded code. Unless errortrace is desired, probably that expansion could/should be reused for a racket-run. A comment in DrR source suggests it does this. (But I'm not confident whether a zo should be written?)

So a fair number of things to think through and implement correctly. Good news is there's precedent/examples from DrR for nearly all of this.
