Add Watchman-backed DiffAwareness implementation #22615
@haxorz Based on #22417 (comment), it seems necessary to me to outsource file system watching in Bazel to a maintained and standard alternative. This could be a first step towards this. I haven't added tests yet since that requires adding |
CC @sluongng |
Some context on Watchman:
Although the development of Watchman has been relatively quiet since Meta has been transitioning to EdenFS (a FUSE for their Sapling VCS), it's still the most robust file-watching solution to date with multiple strategies and fallbacks and is still actively used in newer projects such as Buck2. So I think this is a good addition to Bazel. (0): https://github.com/git/git/commits/master/templates/hooks--fsmonitor-watchman.sample |
src/main/java/com/google/devtools/build/lib/skyframe/WatchmanDiffAwareness.java
907d653 to abd53c2
I pushed a new commit that adds Windows support. I will not make any further changes for now. |
I would like to see more user feedback on whether a solution that requires installing watchman is acceptable. Tagging a few users here, but we might also ask on bazel-discuss or Slack. cc @konste (due to #1931 (comment)) cc @keith (due to #13226 (comment)) cc @sushain97 (due to #18363) @fmeum I assume addressing #5270 would be theoretically possible with watchman? |
@meisterT It would, although I expect the work for that to be mostly orthogonal to the |
Does Watchman support Directory Junctions on Windows? It is NOT supposed to follow Directory Junctions, in the same way as it skips symlinks. This is important for the case when --symlink_prefix is used and points to a directory (like .build). The "convenience symlinks" in that directory are Junctions, not actual directory symlinks. We must not follow those. Another question is how Watchman is supposed to get onto the build machines. Will Bazel install it as part of its own unpacking? |
Watchman doesn't follow junctions, just like it doesn't follow symlinks.
As far as I understand, watchman spawns a server process per user similar to what Bazel does. I do not know what happens when you launch different watchman binaries though (@sluongng Do you happen to know this?). It may be necessary to have a single watchman binary for Bazel and all other clients, which would make bundling less attractive. |
If |
AFAIK, the current implementation makes So there is no need for tight coupling when doing If you are packaging everything in-house today, I would treat this the same way as how |
watchman has the same client-server architecture as Bazel. You should not need to launch a second server since one server can watch multiple file paths. But there are ways to configure multiple servers, just like Bazel with different output paths. |
I was hoping that after some migration / testing period we could remove the current implementations, at least on Mac/Windows so that our maintenance surface in an area that is not well covered already is not increased. |
Except it doesn't quite work on Windows now, specifically because of the problem with --symlink_prefix, which the current implementation does not handle, so it wanders into the Bazel output tree. |
Bundling would also increase Bazel binary size, by 10MB or something in that ballpark. |
@konste Could you please describe how you tested this on Windows? AFAIK, both watchman and Java NIO WindowsWatchService use Windows API Comparing the 2 implementations, |
I'm supportive of this. I imagine watchman would sidestep the issues we saw (I think since I posted that comment it was suggested somewhere that it could have been because too many files changed at once) |
👋 We (Stripe) would definitely try this feature. We already use |
I am a bit confused with your particular issue. Changes to Secondly, Watchman does come with its own config file, in which you could exclude directories from being watched (b). This is applicable on the watchman level, so events from If you still need (a): 0cdd71f#commitcomment-44572088 |
@sluongng let me provide some comments:
This is very nice to hear and practically speaking it is sufficient to solve the problem with Now there is another important question: what is the relationship between the set of folders configured for Watchman and the set of folders Bazel currently scans using its own rules? What would happen if some folder is supposed to be scanned by Bazel but is excluded in the Watchman config? Will it still be scanned by Bazel (but not watched)? We get two sources of truth here, and it can be taken for granted that sooner or later they will diverge. We need to make one of them follow the other, or otherwise provide a mechanism to discover discrepancies between those folder sets and alert the user. The .bazelignore file only adds to the confusion as it is potentially a third source of truth. What do you think would be the right solution here? And by the way, thanks a lot for looking into it! Very much appreciated. |
In my personal opinion, this is not something that could be fixed. It's an inherent downside of using an external file-watching service instead of building it inside the Bazel code base. We could provide some guardrails in the future to help detect and warn users about the diff between |
Not necessarily. Watchman must have some kind of API to be controlled by clients like Bazel, right? Using that API we should be able to exclude a folder automatically based on it being the value of the --symlink_prefix flag, or alternatively it can be auto-skipped because it only contains directory junctions (AKA convenience symlinks). But let me repeat my question to clarify the current design. What would happen if some folder is supposed to be scanned by Bazel but is excluded in the Watchman config? Will it still be scanned by Bazel (but not watched)? |
With the current state of the implementation, such a folder would indeed be seen by Bazel but changes to it wouldn't be tracked. If we decide to go forward with this implementation, we could have Bazel read the list of directories ignored by Watchman and emit a warning or fail if it doesn't ignore all of them.
Yes and no. There are two ways to ignore directories with watchman: the |
THIS is extremely important! Whatever design you end up implementing, please try to detect inconsistencies in configuration and give a big honking warning explaining the problem, probably referencing a page with guidance on how it can be fixed. |
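The consistency check discussed above could look roughly like the following. This is only a sketch, not what the PR implements: it assumes the Watchman ignore list comes from the `ignore_dirs` key of a `.watchmanconfig` file, that Bazel's ignore list comes from `.bazelignore` with `#` comment lines skipped, and all function names are hypothetical. It reports directories that Watchman ignores but Bazel would still read, which is the case where changes would go untracked.

```python
import json
from pathlib import PurePosixPath


def parse_watchman_ignore_dirs(config_text):
    """Read the `ignore_dirs` list from the text of a .watchmanconfig file."""
    config = json.loads(config_text)
    return [PurePosixPath(d) for d in config.get("ignore_dirs", [])]


def parse_bazelignore(text):
    """Read one ignored path per line (assumption: '#' lines are comments)."""
    lines = (line.strip() for line in text.splitlines())
    return [PurePosixPath(l) for l in lines if l and not l.startswith("#")]


def untracked_dirs(watchman_ignored, bazel_ignored):
    """Directories Watchman ignores that are not covered by a Bazel ignore entry."""

    def covered(directory):
        # Covered if Bazel ignores the directory itself or one of its ancestors.
        return any(directory == b or b in directory.parents for b in bazel_ignored)

    return sorted(str(d) for d in watchman_ignored if not covered(d))


def warning_for(watchman_ignored, bazel_ignored):
    """Return a warning string, or None if the two configurations agree."""
    missing = untracked_dirs(watchman_ignored, bazel_ignored)
    if not missing:
        return None
    return (
        "Watchman ignores directories that Bazel still reads; "
        "changes there will not be detected: " + ", ".join(missing)
    )
```

A caller could either print the warning or fail the build, matching the two options mentioned in the thread.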
Watchman is a mature, standalone file watching tool that supports Linux, macOS and Windows. Crucially, its watches can be reused by multiple clients, such as VCS tools, with no additional overhead. The current commit uses Watchman's JSON API over a Unix socket for simplicity. The more compact BSER API could be used in the future if the protocol overhead turns out to be noticeable.
This PR is ready for review. It doesn't have tests yet as those require Watchman to be installed in all CI environments, which I could work on once the prod parts have been approved. Areas of improvement left for follow-up changes:
|
@haxorz from a product point of view, we would be willing to accept this PR as long as it is clearly marked as experimental functionality (I see it is guarded behind and |
@haxorz Friendly ping |
@haxorz do you think you can attend to this review in time for Bazel 8? |
There was a recent build meetup with a talk about Watchman by @arxanas https://blog.waleedkhan.name/incremental-watchman/ It might help folks get up to speed on the what/how of Watchman. |
I wanted to give this a try today, and failed to get watchman to run. The installation instructions (https://facebook.github.io/watchman/docs/install#linux) mention prebuilt debs and binaries that apparently no longer exist. Then I tried to build it from source, but ran into this error:
This seems similar to facebook/watchman#1225 but I already have that library installed. |
I found a linux binary in this (failed) action: https://github.com/facebook/watchman/actions/runs/11827666245/job/32956253441 But sadly it is linked against |
@meisterT @sluongng Building watchman locally from source is similar to building folly or fbthrift: there is a getdeps.py script coordinating the various cmake and cargo calls (the folly README has background; essentially it exists to stitch back together the various bits from the internal monorepo). First you need a toolchain. You might already have these installed, but just in case:
Then the watchman build steps:
Once you have binaries, optionally run the tests. State at time of writing: clean pass on CentOS Stream 9 and Fedora 40, 2 erroring tests on Ubuntu 22.04 that you'll need to take a view on if they are important to you.
Now you are a getdeps.py expert, prepared to build fbthrift, folly, watchman et al., and hopefully you have a working binary! Let me know if this works for you or not, and I'll see if I can help update the watchman docs. I don't have time to debug/fix the test failures on Ubuntu at the moment, but if you do, I can probably help review PRs. |
Watchman is a mature, standalone file watching tool that supports Linux, macOS and Windows. Crucially, its watches can be reused by multiple clients, such as VCS tools, with no additional overhead.
The current commit uses Watchman's JSON API over a Unix socket or Windows named pipe for simplicity. The more compact BSER API could be used in the future if the protocol overhead turns out to be noticeable.
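To illustrate the JSON protocol mentioned above, here is a minimal sketch in Python (not the PR's Java code). It relies on Watchman's documented `query` command with a `since` clock and on the JSON encoding using one JSON value per line; the helper names are hypothetical, and mapping `is_fresh_instance` to "treat everything as modified" is an assumption about how a client such as Bazel would react.

```python
import json
import socket


def encode_command(command):
    """Encode one Watchman command as a JSON PDU (one JSON value per line)."""
    return (json.dumps(command) + "\n").encode("utf-8")


def build_since_query(root, clock):
    """Ask for the names of files under `root` that changed since `clock`."""
    return ["query", root, {"since": clock, "fields": ["name"]}]


def parse_query_response(line):
    """Return (changed_paths, next_clock).

    changed_paths is None when Watchman reports a fresh instance, in which
    case the caller should assume that everything may have changed.
    """
    response = json.loads(line)
    if "error" in response:
        raise RuntimeError(response["error"])
    if response.get("is_fresh_instance", False):
        return None, response["clock"]
    # With a single requested field, "files" is a plain list of values.
    return set(response["files"]), response["clock"]


def send_command(sockname, command):
    """Send one command over the Unix socket and read back one response line.

    `sockname` is the path printed by `watchman get-sockname`. Unix only;
    a Windows client would open the named pipe instead.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sockname)
        sock.sendall(encode_command(command))
        with sock.makefile("rb") as reader:
            return reader.readline()
```

A client would keep the returned clock and pass it as `since` in the next query, so each call reports only the changes made after the previous one.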