Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add file system support for jars #3750

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Arthurm1
Copy link
Contributor

An experiment in using the VSCode FileSystem API to explore library dependencies

Here's an example....
image

Stuff that works...

  1. auto population of libraries on indexing complete
  2. file exploring
  3. file opening
  4. breadcrumbs

Stuff that doesn't work...

  1. IDE functionality - i.e. Metals doesn't recognise these as scala (or java) files (because it doesn't understand the URI)
  2. decompiling of class files from non-source jars - shouldn't be hard to add
  3. searching using VSCode search - doesn't look possible
  4. file explorer <-> current edited document sync - unsure yet

Weird stuff

  1. the user can easily just delete the virtual Metals - libraries workspace folder. Can't see a way to stop this. It doesn't cause an error, it's just that the functionality is then lost. It would be re-added on Reload Window

FileSystem API uses URIs to reference files. With the flat approach taken here (i.e. root/all_jars/XXX path) the real jar URIs will not match with the flat URIs.
So metalslibs:/scala-library-2.13.8-sources.jar!/scala/io/source.scala != jar:file:///localRepoPath/scala-library-2.13.8-sources.jar!/scala/io/source.scala
Because of this VSCode thinks these are 2 different files. Metals also doesn't recognise the former URI and code would have to be added to interpret this URI in the same way the readonly directory is handled. Even so - VSCode would still think these were different files.

Another option would be to take a non-flat structure and have the real directory structure leading to all the jars (albeit only showing dirs with jars in). I haven't checked if this will actually work as VSCode corrupts URIs given to it (e.g. root has to == / when I'd like it to be file:///).
This probably wouldn't be as easy to use and I'm still exploring if it's possible.

@kpodsiad Here's a working, but limited, jar explorer.

@tgodzik
Copy link
Contributor

tgodzik commented Mar 23, 2022

Wow! This looks awesome! This could replace the packages view in Metals tree view if I understand correctly? Does this show up in the file explorer?

IDE functionality - i.e. Metals doesn't recognise these as scala (or java) files (because it doesn't understand the URI)

I would say this is the only blocking thing, very similar to the notebook issue where we have another URI pattern that is not possible for Metals to recognise.

decompiling of class files from non-source jars - shouldn't be hard to add

I think this doesn't work currently also.

searching using VSCode search - doesn't look possible

It's not possible currently, so this would be fine.

file explorer <-> current edited document sync - unsure yet

Do you mean the focused document?

the user can easily just delete the virtual Metals - libraries workspace folder. Can't see a way to stop this. It doesn't cause an error, it's just that the functionality is then lost. It would be re-added on Reload Window

IS there any update sent to Metals about it? Maybe we could just show an error message and ask to reload?

@kpodsiad
Copy link
Member

Having breadcrumbs in dependency files seems awesome @Arthurm1 !

However, after a bit of plating, I spotted one issue. Namely, the fact that the user has to deal with an additional workspace may be problematic:

  • one can close the newly created workspace and it'll disappear. As a possible workaround, we can add client command like enable Metals libraries but it doesn't look like proper UX to me
  • adding a new workspace requires saving .workspace config file

I wonder if we could implement this feature in a more transparent way from the user perspective and it seems like we could

This could replace the packages view in Metals tree view if I understand correctly? Does this show up in the file explorer?

After looking at https://marketplace.visualstudio.com/items?itemName=sourcegraph.sourcegraph and https://github.com/olafurpg/sourcegraph-vscode/blob/diffs-view/src/file-system/FilesTreeDataProvider.ts it seems like we could replace Metals tree view and have breadcrumbs without adding a new workspace folder.

I will even move the package view from Metals tab to the Explorer tab and place it just after the files tab.

@Arthurm1
Copy link
Contributor Author

@tgodzik

Wow! This looks awesome! This could replace the packages view in Metals tree view if I understand correctly? Does this show up in the file explorer?

Yes - it appears in the main file explorer as a new workspace folder below the main workspace folder. It could appear at the top but when you change the first workspace folder then VSCode restarts the extensions which isn't a great UI feel.
I'm not sure whether it could replace the packages view as that explores classes/objects etc rather than files so it's a different approach. You can always open the file and look in the outline view on VSCode but you may have an issue if you want to look at a specific class and you don't know which file it's in.

IDE functionality - i.e. Metals doesn't recognise these as scala (or java) files (because it doesn't understand the URI)

I would say this is the only blocking thing, very similar to the notebook issue where we have another URI pattern that is not possible for Metals to recognise.

I agree - this is the main blocker. I'm starting to think that using the real jar URI jar:file///foo/../bar.jar!/some/class.scala in communication with VSCode limits what Metals can do. Instead I'm thinking Metals should have it's own URI format for jars and decoded files e.g. metals:/bar.jar/some/class.scala when interacting with the client. Then we have full control over what the document is (e.g. decoded class file) and where it is (e.g. deep in a coursier directory). LSP entrypoints could translate this URI into whatever Metals needs. Then we wouldn't be limited in how we presented the file explorer and how we wanted to decode files (i.e. AbsolutePath cannot cope with a non jar | file uri so doesn't like a link to metalsdecode:///foo.class.cfr). I'm still mulling this over.

decompiling of class files from non-source jars - shouldn't be hard to add

I think this doesn't work currently also.

No but you can navigate directly to a class file through the file explorer now so it would be good to implement it and it's fairly simple since one of the methods you have to implement on the filesystem api is readFile and every file open goes through that method

file explorer <-> current edited document sync - unsure yet

Do you mean the focused document?

I mean when you open a jar document (e.g. through goto reference) you should be able to run the reveal active file in side bar command and see it in the file explorer. Doesn't work yet because the URIs are different format. It's the same issue as the main blocker above.

the user can easily just delete the virtual Metals - libraries workspace folder. Can't see a way to stop this. It doesn't cause an error, it's just that the functionality is then lost. It would be re-added on Reload Window

IS there any update sent to Metals about it? Maybe we could just show an error message and ask to reload?

I'll check - probably.

@Arthurm1
Copy link
Contributor Author

@kpodsiad

After looking at https://marketplace.visualstudio.com/items?itemName=sourcegraph.sourcegraph and https://github.com/olafurpg/sourcegraph-vscode/blob/diffs-view/src/file-system/FilesTreeDataProvider.ts it seems like we could replace Metals tree view and have breadcrumbs without adding a new workspace folder.

Having a custom tree view is another option. It would certainly get round the issue of URIs not matching between the file explorer and Metals as it wouldn't rely on the URI to match the tree.
Because this is a read-only filesystem, we don't take advantage of the full functionality of the Filesystem api (copying, deleting, watching etc.) so using it could be overkill.
On the other side - if a client supports the FileSystem api then it's trivial for it to support the jar explorer.

Maybe I should give the custom tree view a go and compare the results

@Arthurm1
Copy link
Contributor Author

Changing from a flat file structure to a real representation of where the jars are located doesn't look great...
image

@Arthurm1
Copy link
Contributor Author

I'm going to close this since the way the FileSystem API makes the tree UI exactly reflect the URI makes it unworkable for the way Metals wants to use URIs and the way we want to display libraries.

@Arthurm1 Arthurm1 closed this Mar 26, 2022
@tgodzik
Copy link
Contributor

tgodzik commented Mar 28, 2022

I'm going to close this since the way the FileSystem API makes the tree UI exactly reflect the URI makes it unworkable for the way Metals wants to use URIs and the way we want to display libraries.

The initial approach in the PR description looked pretty awesome to me, why not revert to that?

@Arthurm1
Copy link
Contributor Author

@tgodzik I closed this because I couldn't see an easy way to match off the 2 different URI formats (i.e. the one Metals uses to identify classes in jars and the one the VSCode filesystem api uses). Instead I thought this would be better implemented using a custom treeview

But now I see that breadcrumbs only appear when the filesystem supports them so using the filesystem API is the only way to get breadcrumbs on jar dependencies. see here

Currently Metals understands dependencies when they are in readonly dir or when referred to using the full uri:
jar:file:///jarPath/jarName-source.jar!/somePackage/someFile.scala

I'm going to look at changing the way Metals understands jar URIs to match the URI on the filesystem API. So something like:
jar:/jarName-source.jar/somePackage/someFile.scala

Note the lack of file:, the lack of jarPath and the lack of !

This should be viable because it will only affect clients that support virtual documents. The content of the virtual documents is supplied by Metals so the format of the URI shouldn't matter to the client.

The issue I think I'm going to run up against is that AbsolutePath is used to reference documents everywhere in Metals and these won't be valid paths. In fact, if I need to use metalsDecode scheme to decompile .class files then they won't be valid at all as Java only supports file and jar schemes. I've already run into this issue here which meant it wasn't possible to use goto or find references in a decompiled class file

I think there are 2 options here:

  1. Change uses of AbsolutePath to AbsoluteURI in the codebase. This would be pretty invasive, mean a lot of code changes and I can't tell at this stage if it would deliver everything wanted here. The reason it would be needed is that AbsolutePath wraps nio.Path which forces a validity check on the jar everytime you convert it to a URI i.e. that jar file must exist on that path on the filesystem and it won't because of the lack of file:, jarPath and ! in the URI.
  2. Create a java FileSystem for a new Metals scheme leading to something like metals:/jarName/package/file.class

I'm leaning towards the 2nd option as it wouldn't affect any other code and should cover both source files and class files that need decompiling.
There is an example implementation of a GithubFS here

Unless you can think of an easier option.

@Arthurm1 Arthurm1 reopened this Mar 29, 2022
@Arthurm1 Arthurm1 mentioned this pull request Apr 6, 2022
@tgodzik
Copy link
Contributor

tgodzik commented Apr 13, 2022

@Arthurm1 I recently got back from vacations and only know managed to take a look at your response, sorry!

Looks like a second solution is better, though I am still trying to figure out if we can make it any simpler 🤔

@Arthurm1
Copy link
Contributor Author

@tgodzik No worries - glad you've had a break.

I haven't been able to think of another option so I've gone ahead with a MetalsFileSystem/MetalsFileSystemProvider/MetalsPath and it seems to work quite well.

I've separated out jdks, source jars, workspace jars into subdirs (but that's just my preference - Metals can now dictate the folder structure in the file explorer)

The user can navigate using goto/refs, file explorer or breadcrumbs and they all just sync.

I've (nearly) got auto-decompiling and interactive semanticdb working for class files so you can navigate non-source jars as though they were java files.

Still a lot to do though and I have to support readonly directory for those non-VSCode legacy clients 😉

metals-1649865855473.mp4

@ckipp01
Copy link
Member

ckipp01 commented Apr 13, 2022

non-VSCode legacy clients 😉

whoa whoa whoa 😄

@Arthurm1
Copy link
Contributor Author

I think this is now almost functionally complete. There's a lot of code I need to tidy and many tests to write but it would be good to get some feedback on whether the approach is good. By that I mean both how the UI works and whether the custom URL handler and custom filesystem I've had to write are overkill or disrupt something I haven't thought of. There's probably some optimisation needed as well.

This is linked to scalameta/metals-vscode#963 if you'd like to try it out. Please do.

After initial indexing it should populate the VSCode file explorer with a new file system root which may (on the first time only) cause your VSCode to reload. I'm not sure yet whether this is avoidable.
Then you should be able to treat dependencies as though they are read only files on your system. That includes the JDK and the non-source dependencies. So sync between active file and explorer and use breadcrumbs.
Non-source dependencies should be navigable as the .class files are now auto-decompiled by CFR.
You should still be able to add breakpoints and debug as the file system paths should get translated for the scala-debug-adapter
Find in jars works but is slower and I probably need to look at that.
JDK lines in stacktraces don't seem to be highlighted - so I need to look at that.

For non-vscode clients, the readonly stuff should work as usual and they shouldn't even be aware that metals is using a metalsfs: filesystem instead of jar:. One benefit of this PR is that they should now also have the auto-decompilation of class files for jars with no sources.

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented May 5, 2022

@tgodzik When you have a moment - could you have a look through this and see whether you're happy with the approach regarding custom java filesystem and custom url handler. You may decide it's too invasive. It's not urgent. It's also easy enough to try out yourself with the vscode PR as well. Code's a bit of a mess but I thought I'd stop here in case you're dead against it.

@tgodzik
Copy link
Contributor

tgodzik commented May 5, 2022

@tgodzik When you have a moment - could you have a look through this and see whether you're happy with the approach regarding custom java filesystem and custom url handler. You may decide it's too invasive. It's not urgent. It's also easy enough to try out yourself with the vscode PR as well. Code's a bit of a mess but I thought I'd stop here in case you're dead against it.

It's on my list! I will take a look 😅

@tgodzik
Copy link
Contributor

tgodzik commented May 6, 2022

It works really great!

There is a couple of issues I noticed:

  • each time I open a workspace I get an error
    image
    maybe we could create the additional filesystem only after indexing? I think currently we send it right away.

  • it seems that each time I turn off the workspace I get a warning about saving the workspace configuration. Could we avoid it? I can take a look into it next week.

  • the Tasks output is focused on open each time.

I demoed the approach for a couple of people and they seemed really enthusiastic about it.

@tgodzik
Copy link
Contributor

tgodzik commented May 6, 2022

  • it seems that each time I turn off the workspace I get a warning about saving the workspace configuration. Could we avoid it? I can take a look into it next week.

Maybe we could remove the added folder in deactivate function? That should potentially work I think.

@Arthurm1
Copy link
Contributor Author

  • each time I open a workspace I get an error
    image
    maybe we could create the additional filesystem only after indexing? I think currently we send it right away.

How are you reproducing that - I can't seem to.

If I leave a jar file open and then restart VSCode I do get this...
image
But then when Metals has initialised, I can click on Try Again and it will load. This is expected.

I've played with when the folder is added in VSCode. It's a bit ugly because when you transition from a single root to multiple folder setup, VSCode will restart. This is nothing to do with Metals - it happens if you just manually add an additional folder to VSCode when you only have 1 root folder. Because of that, Metals starts from scratch again.
If I wait for indexing to finish before adding the folder then Metals has to perform indexing twice which is slow. So I think it's best to add the folder on activation of Metals (which causes an immediate restart) and then register the filesystem provider when indexing is complete. So the folder just sits there with an error icon until the provider is registered.
This could be sped up further as we don't need the jars indexed, we only need the list of dependency and source jars to be retrieved from BSP which is currently conflated with indexing. This could be separated out and the libraries folder would be functional sooner.
Also - if you do save the workspace - there is no restarting of VSCode each time.

  • it seems that each time I turn off the workspace I get a warning about saving the workspace configuration. Could we avoid it? I can take a look into it next week.

This appears to be how VSCode works when you switch to a multi-folder workspace. I can't see a way round it. You don't have to save the workspace but you'll be continually prompted on VSCode exit if you don't

  • the Tasks output is focused on open each time.

Have you saved the workspace when Tasks output was focused? VSCode will remember that and open like that every time. Try re-saving the workspace with a different output focused or the bottom panel closed to test this.

  • Maybe we could remove the added folder in deactivate function? That should potentially work I think.

I've tried this and can't get it to work. I think VSCode is shutting a lot of itself down in parallel with extension deactivation and it's not possible to alter the workspace folders at that time. I can't think of any nice way to remove the libraries folder at the end of a session.

We could add an option to not add the folder to VSCode - just in case users are unhappy with the switch to multi-root workspace. Navigation, bread crumbs and auto class decompilation would still work. There would just be no file exploring.

@tgodzik
Copy link
Contributor

tgodzik commented May 25, 2022

How are you reproducing that - I can't seem to.

I will try to see what's going on!

This appears to be how VSCode works when you switch to a multi-folder workspace. I can't see a way round it. You don't have to save the workspace but you'll be continually prompted on VSCode exit if you don't

Damn, it would be good to be able to be figure out a way around it. I will take a look.

This could be sped up further as we don't need the jars indexed, we only need the list of dependency and source jars to be retrieved from BSP which is currently conflated with indexing. This could be separated out and the libraries folder would be functional sooner.

That could be a good idea, I can also take a look if you don't have time?

Have you saved the workspace when Tasks output was focused? VSCode will remember that and open like that every time. Try re-saving the workspace with a different output focused or the bottom panel closed to test this.

I will investigate this as soon as I can.

Thanks for the great work! I showed your work in a couple places and people would love to have it I think 🎉

@Arthurm1
Copy link
Contributor Author

@tgodzik I haven't had much time recently and I'm off on holiday for a few weeks. I don't think there's much to do in terms of functionality - just some finessing on the points you've raised. I have got a chunk of tests to write though which I'm slowly getting through. 😞

@tgodzik
Copy link
Contributor

tgodzik commented Jun 2, 2022

@Arthurm1 actually, could we avoid adding the workspace. Everything works really nicely without it and I don't get any issues, any at all really. Using Tree View works pretty well in the case and we could just make it work with reveal command. We wouldn't see files, but I think it's overall a better experience. And we can improve the tree view further to also show jars etc.

Btw. I can finish up with the tests if you don't find the time.

Anyway, this is pretty exciting!

@Arthurm1
Copy link
Contributor Author

actually, could we avoid adding the workspace. Everything works really nicely without it and I don't get any issues, any at all really. Using Tree View works pretty well in the case and we could just make it work with reveal command. We wouldn't see files, but I think it's overall a better experience. And we can improve the tree view further to also show jars etc.

@tgodzik I agree. I think the switch to multi-workspace when adding Metals Libraries in File explorer is quite jarring from a usability point of view. I was wondering about adding a show libraries in explorer command - local only to vscode client. Then if users want to use file explorer they can but they're not forced into it. Sadly the build-in vscode Add folder to workspace command doesn't allow URI schemes other than file:

@tgodzik
Copy link
Contributor

tgodzik commented Jun 10, 2022

I was wondering about adding a show libraries in explorer command - local only to vscode client.

That should be fine as long as it's not the default and we are not forcing the users. And it might be pretty useful to some. I like the experience of the separate file system provider. Would be cool to replicate this in tree view, but not sure if that is currently possible.

@jeengbe
Copy link

jeengbe commented May 2, 2024

Hi! What's the status for this change?

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented May 2, 2024

@jeengbe I haven't looked at it in a long time. I got the performance issues sorted but still had issues with URLClassLoader if I remember right. Code's pretty old now though, compared with the current Metals codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experiment Ticket related to an experiment or experimental features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants