From ccd1bed3a6eabe0d67d401be18ba348d1c494f3a Mon Sep 17 00:00:00 2001 From: Rikito Taniguchi Date: Tue, 30 Aug 2022 19:01:52 +0900 Subject: [PATCH] docs: Add a design doc for Langauge Support for notebook Added a design document for https://github.com/scalameta/metals-feature-requests/issues/236 This document is for - design review the architecture before going further - make sure we are on the same page about the goal, design, and requirements for this feature --- docs/design/language-support-notebook.md | 303 +++++++++++++++++++++++ 1 file changed, 303 insertions(+) create mode 100644 docs/design/language-support-notebook.md diff --git a/docs/design/language-support-notebook.md b/docs/design/language-support-notebook.md new file mode 100644 index 00000000000..3ca1b433af0 --- /dev/null +++ b/docs/design/language-support-notebook.md @@ -0,0 +1,303 @@ +Language support for VSCode notebook + +# Overview + +We can run Scala programs on Jupyter Notebooks using a Jupyter Kernel for Scala called [Almond](https://almond.sh/). + +Currently, when using notebook cells in VSCode, Metals cannot provide language support such as code completion (by Metals) and code navigations. + +This design document describes how Metals can support many language features for the notebook environment in VSCode + +# Scope + +The scope of this document is limited to **notebook in VSCode**. It doesn't cover LSP support for jupyter notebook in general (like integration with [jupyter-lsp](https://github.com/jupyter-lsp/jupyterlab-lsp)). + +However, supporting the notebooks in VSCode would be a good first step for integrations with other notebook environments. + +## Goal + +- When users running Almond Kernel in VSCode, users can enjoy language features that Metals provides. + - All the `textDocument/*` requests should be properly handled. + - (Maybe we can omit some features for notebook (like `textDocument/foldingRange` may be useless in notebook) though). + +## Non-Goal + +- Provide language support for other notebook environments other than VSCode notebook (such as `jupyter-lsp`). +- Automatically set up Almond Kernel based on the project setting + - Install Almond Kernel when users open a `*.ipynb` file and no Scala Kernel is available. +- Enable Almond to use project-defined components from notebooks. + - If we have `A.scala` adjacent to `B.ipynb`, `B.ipynb` should be able to `import` something from `A.scala`. + +# Context + +## LSP 3.17 NotebookCellTextDocumentFilter + +Starting from LSP 3.17, LSP clients can recognize notebook cells as documents and request `textDocument/*` to LSP servers. +For example, the following change unlocks the `textDocument/*` requests for jupyter-notebook (of Scala cells) in VSCode. + +```diff +diff --git a/src/extension.ts b/src/extension.ts +index 13b49f6f..32a99c36 100644 +--- a/src/extension.ts ++++ b/src/extension.ts +@@ -291,6 +291,10 @@ function launchMetals( + { scheme: "file", language: "java" }, + { scheme: "jar", language: "scala" }, + { scheme: "jar", language: "java" }, ++ { ++ notebook: { scheme: "file", notebookType: "jupyter-notebook" }, ++ language: "scala", ++ }, + ], + synchronize: { + configurationSection: "metals", +``` + +see: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#notebookCellTextDocumentFilter + +Now, notebook on VSCode starts sending the `textDocument/*` request to LSP server. + +```json +[Trace - 01:09:28 PM] Received request 'textDocument/completion - (140)' +Params: { + "context": { + "triggerKind": 2, + "triggerCharacter": "." + }, + "textDocument": { + "uri": "vscode-notebook-cell:/Users/tanishiking/src/github.com/tanishiking/scala3-playground/src/main/scala/Untitled-1.ipynb#W2sZmlsZQ%3D%3D" + }, + "position": { + "line": 0, + "character": 2 + } +} +``` + +## LSP 3.17 Notebook Document syncronization + +In addition to `NotebookCellTextDocumentFilter`, LSP 3.17 provides a feature called Notebook Document syncronization. + +see: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#notebookDocument_synchronization + +By setting server capabilities in `scalameta/metals` + +```diff +diff --git a/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala b/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala +index 00aa1d791f..35b0ea2dc9 100644 +--- a/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala ++++ b/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala +@@ -860,6 +860,16 @@ class MetalsLanguageServer( + ServerCommands.all.map(_.id).asJava + ) + ) ++ val selector = new NotebookSelector() ++ selector.setNotebook( ++ JEither.forLeft[String, NotebookDocumentFilter]("*") ++ ) ++ selector.setCells(List(new NotebookSelectorCell("scala")).asJava) ++ capabilities.setNotebookDocumentSync( ++ new NotebookDocumentSyncRegistrationOptions( ++ List(selector).asJava ++ ) ++ ) + capabilities.setFoldingRangeProvider(true) + capabilities.setSelectionRangeProvider(true) + capabilities.setCodeLensProvider(new CodeLensOptions(false)) +``` + +Notebook LSP client start sending several notifications such as `notebookDocument/didOpen`, and `notebookDocument/didChange`. +LSP 3.17 expects LSP servers to syncronize the notebook contents by these notifications, and provide language features based on the synchronized content. + +# Problems + +Though LSP 3.17 provides some specifications around notebook support, it doesn't work out of the box with Metals for mainly two reasons. + +## Metals doesn't understand the Scala program in a cell + +As you found, LSP client (VSCode) sends `textDocument/*` request with + +- `uri` points to the notebook cell. +- `position` represents the position in the cell. + +``` +"textDocument": { + "uri": "vscode-notebook-cell:/Users/tanishiking/src/github.com/tanishiking/scala3-playground/src/main/scala/Untitled-1.ipynb#W2sZmlsZQ%3D%3D" +}, +"position": { + "line": 0, + "character": 2 +} +``` + +However, Metals can't understand each cell of notebook. For example, in the following setting, Metals can't handle cell2 and cell3 (in Scala2, Metals can't cell1 too). Because each cell doesn't compile. + +```scala +// cell1 +def add(a: Int, b: Int): Int = a + b + +// cell2 +add(1, 2) + +// cell3 +import $ivy.`io.circe::circe-generic:0.11.1` +``` + +**In order to analyze Scala code and provide compiler-based language features, Metals have to convert cells into compilable Scala program.** + +## Metals server cannot recognise `*.ipynb` file as a compile target + +Metals server provide language features to the files that are recognized as a `buildTarget/sources` of [Build Server Protocol](https://build-server-protocol.github.io/docs/specification.html#build-target-sources-request). + +However, current Metals doesn't recognize `*.ipynb` (or combined Scala source, that we'll mention later) for Build Servers. + +**Metals have to run build server that recognizes `*.ipynb` files as a `SourceItem`**. Otherwise, Metals cannot tell which Scala version of presentation compiler should run against the notebook (Kernel). + +# Solution / Technical Architecture + +In order to provide language support for notebooks, two things are needed. + +- Convert cells to compilable Scala sources. + - Also, Metals have to convert the incoming request parameter (for cells) into the parameter for the (converted) compilable Scala source (back and forth). + - Converted sources have to be on the filesystem, because Metals expect sources to be on the filesystem. +- Build Server that recognizes `*.ipynb` as a build target. + - Where `ScalaBuildTarget.scalaVersion` should match with the Scala version of the running Kernel. + - When users open a notebook with Scala Kernel, Metals should prompt running the Build Server for the notebook, like Metals does for Ammonite script. + +## Components + +- `Almond BSP` + - Run on the same process with `Almond Kernel`, but the I/O will be via unix domain socket at a pre-defined location (somewhere in `~/.local/share/almond/bsp/{kernel-name}.sock`). See: https://github.com/scalameta/metals/pull/4253#issuecomment-1222091839 + - `buildTarget/compile` will convert cells into compilable Scala sources (as Almond Kernel / Ammonite does) and **write the generated Scala sources onto the file system** (as Ammonite Build Server does). Also, `buildTarget/compile` should enable SemanticDB plugin. + - Answer `workspace/buildTargets`. One notebook - one buildTarget. + - Scala version should be match with Kernel's one. + - Accept `buildTarget/scalacOptions` and `ScalacOptions.classDirectory` should point to the directory where Metals can find the Almond-generated Scala sources. + - (optional) instead, we can support `workspace/wrappedSources` as [ScalaCli does](https://github.com/VirtusLab/scala-cli/blob/6efbefb1d864c0ee36156f9ac8489d0e14ee54c4/modules/scala-cli-bsp/src/main/java/scala/build/bsp/ScalaScriptBuildServer.java#L9-L10) +- `VSCode` + - Notebook LSP client, send `textDocument/*` requests to `Metals` and receive responses. + - Notify `notebookDocument/didOpen|didChange` to Metals server. +- `Metals` + - When Metals receive `notebookDocument/didOpen`, connect to Almond BSP and register the result of `workspace/buildTargets`. + - Receive `textDocument/*` request and convert the params to the generated Scala source. And then convert the result back to the cell positions. + +## System flow + +### When VSCode start running Almond Kernel on VSCode + +```mermaid +sequenceDiagram + participant AlmondKernel + actor VSCode + participant Metals + participant AlmondBSP + VSCode->>AlmondKernel: (run and) connect to the kernel + VSCode->>Metals: notebookDocument/didOpen + Metals->>AlmondBSP: connect to AlmondBSP + Metals->>AlmondBSP: workspace/buildTargets + AlmondBSP->>Metals: buildTarget for the notebook +``` + +- 1. When Almond Kernel started, Almond Kernel process will create a sock file (for Almond BSP) in a pre-defined position. +- 2. When VSCode open a notebook, Metals will receive `notebookDocument/didOpen`, then Metals will try to connect to the Almond BSP via the sock file created at (1) + - Concern: Metals may have to wait until the sock file is created (?) +- 3. Then Metals asks `workspace/buildTargets` to Almond BSP, and Almmond BSP answers. +- 4. Register the mapping URI of notebook cell -> generated Scala sources to [MappedSource](https://github.com/scalameta/metals/blob/626b2a65199ea39ffca68f1e6330b67b9142a1fe/metals/src/main/scala/scala/meta/internal/metals/TargetData.scala#L303-L308) + - see: https://github.com/scalameta/metals/blob/626b2a65199ea39ffca68f1e6330b67b9142a1fe/metals/src/main/scala/scala/meta/internal/metals/ammonite/Ammonite.scala#L108-L129 + - It would be even trickier than Almond one. + +### When editing notebook + +```mermaid +sequenceDiagram + actor VSCode + participant Metals + participant AlmondBSP + VSCode->>Metals: notebookDocument/didChange + Metals->>AlmondBSP: buildTarget/compile +``` + +- 1. Every time users edit notebook, Metals receive `notebookDocument/didChange`. +- 2. Then Metals will run `buildTarget/compile` to Almond BSP for generating the compilable Scala source files and generate SemanticDBs. + - I guess it would be required because users usually keep coding on Notebook without saving. + - Maybe we want to debounce the `buildTarget/compile` + +## When VSCode request `textDocument/*` + +- We can convert notebook cell's URI `<->` generated Scala source for the cell, using `MappedSource`. +- Conversion between position in the cell `<->` position in the generated Scala source can be done as the same way with Almond script. + +# Alternative Solution + +## Can we just combine cells into Ammonite script, instead of communicating with Almond BSP? + +While Almond can generate accurate Scala source program, it takes many steps for providing notebook support. + +Instead of using Almond, Metals can combine cells into `*.sc` file. + +For example, given the following notebook cells, + +```scala +// cell1 +import $ivy.`io.circe::circe-generic:0.11.1` + +// cell2 +def add(a: Int, b: Int): Int = a + b + +// cell3 +add(1, 2) +``` + +Metals can combine those cells into `combined.sc`. The combine source will be a valid Ammmonite script file. +Now Metals can re-use the Ammonite BSP for the language support. + +```scala +import $ivy.`io.circe::circe-generic:0.11.1` +def add(a: Int, b: Int): Int = a + b +add(1, 2) +``` + +- pros + - We don't need to implement BSP for Almond, and re-use the Ammonite BSP. + - We will have `one notebook <-> one generated scala source`, that might be easier to handle (while Almond BSP will `one notebook cell <-> one generated scala source`). +- cons + - It doesn't work if the program contains Almond-pre-imported things or [User APIs](https://almond.sh/docs/api). + - We have to run the same Scala version for Ammonite BSP with Almond Kernel, but it complicates the [Ammonite BSP logic](https://github.com/scalameta/metals/blob/fa03b00ad31307763979a00c4ccf794a2054c3fe/metals/src/main/scala/scala/meta/internal/metals/ammonite/Ammonite.scala#L184-L191) a bit. + - It adds up one more layer of conversion: `cell <-> combined sc file <-> Ammonite generated scala file`. + +## Run Almond BSP only when it's from VSCode + +Instead of running Almond BSP all the time, maybe we can run Almond BSP only when it's from VSCode? +I don't think we can, because: + +- Looks like Kernel don't have information who is the clients, AFAIK. +- Kernel may accept connection from multiple clients. Kernel started from VSCode doesn't mean it works only with Metals. + +## Almond BSP shouldn't accept `buildTarget/compile`? + +Instead of accepting `buildTarget/compile` and generate Scala sources on notebook changes, we can let Kernel compile Scala sources (enabling SemanticDB and writing out the generated Scala sources to the filesystem). + +It works only if Almond Kernel can detect it's running for Metals, but it looks like impossible as mentioned in the above section. + +# Milestones + +- [ ] Implement bare-minimum Almond BSP + - Create a sock file in a pre-defined location. + - Accept `build/initialize` + - `workspace/buildTargets` should returns the appropriate Scala version +- [ ] Metals can connect to Almond BSP + - When Metals open `*.ipynb` with `notebookDocument/didOpen`, connect to Almond Build Server, and handshake. +- [ ] Almond BSP can compile notebook cells + - Almond should accept `buildTarget/compile` + - Write out the generated Scala sources to a pre-defined location (based on base directory). + - Accept `buildTarget/scalacOptions` and `ScalacOptions.classDirectory` should point to the directory where Metals can find the Almond-generated Scala sources. +- [ ] Metals invoke `buildTarget/compile` for every notebook changes + - (optional) Metals should debounce the incoming requests +- [ ] Metals should register MappedSource between `notebook cell URI <-> generated Scala source (found from ScalacOptions.classDirectory)`. +- [ ] Metals can handle `textDocument/*` requests using presentation compiler. +- [ ] Metals can handle `textDocument/*` requests using SemanticDB + - Almond should generate SemanticDB on `buildTarget/compile`. + +# Concerns + +- Almond will create a sock file for Almond BSP even if users won't use Metals. +- Metals may receive `notebookDocument/didOpen` before Almond BSP starts (?)