-
Notifications
You must be signed in to change notification settings - Fork 10
Significantly improve performance, reliability and memory usage #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7e9d37d
to
3b4d55c
Compare
Oh, for posterity: this has known issues with
If the namespace is already present on the classpath, calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first pass complete, 4 files left to review, will finish them later today
(do | ||
(doseq [p classpath] | ||
(add-url classloader p)) | ||
classloader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk that this will cause some things to successfully compile even though they depend on something no longer in the classpath?
oh yeah I finished this yesterday but didn't have any more comments to add :) |
96abf51
to
f25ed9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking much cleaner now! very nice
src/rules_clojure/compile.clj
Outdated
;; files. If we reload after compiling, that breaks some OSS code | ||
;; doing janky things. Therefore, cache every compile request, and if | ||
;; bazel asks for the same ns again, just copy the files. We | ||
;; fingerprint namespaces using the SHA of the file contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
src/rules_clojure/compile.clj
Outdated
|
||
(defn agent-compile [all-ns-decls dest-dir ns] | ||
((ensure-agent ns) | ||
(fn [_] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we're never using the agent state, we're using it to do to 3 things:
- run thunks one at a time per ns
- run thunks in different nses in parallel
- wait for those to be done
Could we:
- create a lock per ns
- require and compile would be
(future (locking ns-lock (do-stuff)))
- deref the returned futures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ lock + future works if we don't care about ordering of the submitted requests, which maybe we don't (because of sha?). But if we do, I think a better idea (also mentioned in Slack) is to create a (Executors/newSingleThreadExecutor)
per ns and then (.submit ns-exec ^Runnable (bound-fn [] ...))
. that will return a future that you can deref to await
9731425
to
9e6cc59
Compare
src_dir = None | ||
|
||
compile_classpath = compile_info.transitive_runtime_jars.to_list() + ctx.files.compiledeps + [classes_dir] | ||
compile_classpath = compile_info.transitive_runtime_jars.to_list() + ctx.files.compiledeps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Classes dir is no longer a declared output of the build, which means:
- downstream targets can't read from it
- we've moved classes dir to a per-request temp dir, which means other rules-clojure processes can't read it either (fixes a race condition where a second rules clojure process would read out the classes dir from the other process, because it was on the classpath)
"worker.clj", | ||
"parse.clj", | ||
"gen_build.clj", | ||
"namespace/parse.cljc", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vendor tools.reader, java.classpath, and finish vendoring tools.namespace. This is to avoid causing incompatibility with user code, and adds a few opportunities for caching.
"@rules_clojure_maven_deps//:org_clojure_data_json", | ||
"@rules_clojure_maven_deps//:org_clojure_java_classpath", | ||
"@rules_clojure_maven_deps//:org_clojure_tools_namespace"]) | ||
"@rules_clojure_maven_deps//:org_clojure_tools_deps_alpha"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to vendor tools.deps.alpha because it's used by gen-build, not rules-clojure.compile
outs=["libcompile.jar"]) | ||
|
||
genrule( | ||
name="bootstrap-libfs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started being more strict about never reloading code, which caused problems compiling rules-clojure code using the rules clojure compiler, so bootstrap more of it.
(filter identity) | ||
(apply merge))) | ||
|
||
(defn ->lib->jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused code
[clojure.tools.namespace.find :as find] | ||
[rules-clojure.fs :as fs] | ||
[rules-clojure.parse :as parse]) | ||
[rules-clojure.namespace.find :as find] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor tweaks in gen-build to accommodate vendoring tools.namespace. Also drive-by delete some unused code
c45708e
to
d0eb1ea
Compare
(assert (loaded? ns) (print-str ns "should have already been compiled")) | ||
(copy-classes (fs/->path cache-dir) (fs/->path dest-dir)))) | ||
|
||
(def no-aot '#{;; specs are too big to AOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This moved from banksy's deps.edn because any require on the source tree can trigger it. The proper way to fix is to move it into a bazel toolchain, but that's a lot more work
fce9c70
to
7c5bc2f
Compare
82e1a93
to
09dcc95
Compare
Significantly improve compile time, CPU usage and memory usage. Significant rework of internals.
The previously observed memory "leak" was not a leak. To avoid race conditions and to get parallelism, we were creating a classloader per thread and compiling in every thread. This wasn't leaking memory, just using more memory than a dev laptop has available. Wall clock time was also poor because reloading the same classes once per core (8-10x) is inefficient.
We instead use a single classloader shared by all threads. We avoid the previous need to discard a classloader (causing more wasted work!) by instead reloading downstream dependencies, the same way a human would at a REPL. Parallelism is handled via agents. We create an agent for each namespace in the project, and
send
the commands to it. Before compiling a namespace, we sendrequire
to all of its dependents. If the namespace contains deftype, definterface or defprotocol, we sendreload
to all downstream dependencies, to avoid AOT hell.