Skip to content

Commit

Permalink
docs: Add a design doc for Langauge Support for notebook
Browse files Browse the repository at this point in the history
Added a design document for scalameta/metals-feature-requests#236
This document is for
- design review the architecture before going further
- make sure we are on the same page about the goal, design, and
  requirements for this feature
  • Loading branch information
tanishiking committed Aug 31, 2022
1 parent 6ab6649 commit ccd1bed
Showing 1 changed file with 303 additions and 0 deletions.
303 changes: 303 additions & 0 deletions docs/design/language-support-notebook.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,303 @@
Language support for VSCode notebook

# Overview

We can run Scala programs on Jupyter Notebooks using a Jupyter Kernel for Scala called [Almond](https://almond.sh/).

Currently, when using notebook cells in VSCode, Metals cannot provide language support such as code completion (by Metals) and code navigations.

This design document describes how Metals can support many language features for the notebook environment in VSCode

# Scope

The scope of this document is limited to **notebook in VSCode**. It doesn't cover LSP support for jupyter notebook in general (like integration with [jupyter-lsp](https://github.com/jupyter-lsp/jupyterlab-lsp)).

However, supporting the notebooks in VSCode would be a good first step for integrations with other notebook environments.

## Goal

- When users running Almond Kernel in VSCode, users can enjoy language features that Metals provides.
- All the `textDocument/*` requests should be properly handled.
- (Maybe we can omit some features for notebook (like `textDocument/foldingRange` may be useless in notebook) though).

## Non-Goal

- Provide language support for other notebook environments other than VSCode notebook (such as `jupyter-lsp`).
- Automatically set up Almond Kernel based on the project setting
- Install Almond Kernel when users open a `*.ipynb` file and no Scala Kernel is available.
- Enable Almond to use project-defined components from notebooks.
- If we have `A.scala` adjacent to `B.ipynb`, `B.ipynb` should be able to `import` something from `A.scala`.

# Context

## LSP 3.17 NotebookCellTextDocumentFilter

Starting from LSP 3.17, LSP clients can recognize notebook cells as documents and request `textDocument/*` to LSP servers.
For example, the following change unlocks the `textDocument/*` requests for jupyter-notebook (of Scala cells) in VSCode.

```diff
diff --git a/src/extension.ts b/src/extension.ts
index 13b49f6f..32a99c36 100644
--- a/src/extension.ts
+++ b/src/extension.ts
@@ -291,6 +291,10 @@ function launchMetals(
{ scheme: "file", language: "java" },
{ scheme: "jar", language: "scala" },
{ scheme: "jar", language: "java" },
+ {
+ notebook: { scheme: "file", notebookType: "jupyter-notebook" },
+ language: "scala",
+ },
],
synchronize: {
configurationSection: "metals",
```

see: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#notebookCellTextDocumentFilter

Now, notebook on VSCode starts sending the `textDocument/*` request to LSP server.

```json
[Trace - 01:09:28 PM] Received request 'textDocument/completion - (140)'
Params: {
"context": {
"triggerKind": 2,
"triggerCharacter": "."
},
"textDocument": {
"uri": "vscode-notebook-cell:/Users/tanishiking/src/github.com/tanishiking/scala3-playground/src/main/scala/Untitled-1.ipynb#W2sZmlsZQ%3D%3D"
},
"position": {
"line": 0,
"character": 2
}
}
```

## LSP 3.17 Notebook Document syncronization

In addition to `NotebookCellTextDocumentFilter`, LSP 3.17 provides a feature called Notebook Document syncronization.

see: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#notebookDocument_synchronization

By setting server capabilities in `scalameta/metals`

```diff
diff --git a/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala b/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala
index 00aa1d791f..35b0ea2dc9 100644
--- a/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala
+++ b/metals/src/main/scala/scala/meta/internal/metals/MetalsLanguageServer.scala
@@ -860,6 +860,16 @@ class MetalsLanguageServer(
ServerCommands.all.map(_.id).asJava
)
)
+ val selector = new NotebookSelector()
+ selector.setNotebook(
+ JEither.forLeft[String, NotebookDocumentFilter]("*")
+ )
+ selector.setCells(List(new NotebookSelectorCell("scala")).asJava)
+ capabilities.setNotebookDocumentSync(
+ new NotebookDocumentSyncRegistrationOptions(
+ List(selector).asJava
+ )
+ )
capabilities.setFoldingRangeProvider(true)
capabilities.setSelectionRangeProvider(true)
capabilities.setCodeLensProvider(new CodeLensOptions(false))
```

Notebook LSP client start sending several notifications such as `notebookDocument/didOpen`, and `notebookDocument/didChange`.
LSP 3.17 expects LSP servers to syncronize the notebook contents by these notifications, and provide language features based on the synchronized content.

# Problems

Though LSP 3.17 provides some specifications around notebook support, it doesn't work out of the box with Metals for mainly two reasons.

## Metals doesn't understand the Scala program in a cell

As you found, LSP client (VSCode) sends `textDocument/*` request with

- `uri` points to the notebook cell.
- `position` represents the position in the cell.

```
"textDocument": {
"uri": "vscode-notebook-cell:/Users/tanishiking/src/github.com/tanishiking/scala3-playground/src/main/scala/Untitled-1.ipynb#W2sZmlsZQ%3D%3D"
},
"position": {
"line": 0,
"character": 2
}
```

However, Metals can't understand each cell of notebook. For example, in the following setting, Metals can't handle cell2 and cell3 (in Scala2, Metals can't cell1 too). Because each cell doesn't compile.

```scala
// cell1
def add(a: Int, b: Int): Int = a + b

// cell2
add(1, 2)

// cell3
import $ivy.`io.circe::circe-generic:0.11.1`
```

**In order to analyze Scala code and provide compiler-based language features, Metals have to convert cells into compilable Scala program.**

## Metals server cannot recognise `*.ipynb` file as a compile target

Metals server provide language features to the files that are recognized as a `buildTarget/sources` of [Build Server Protocol](https://build-server-protocol.github.io/docs/specification.html#build-target-sources-request).

However, current Metals doesn't recognize `*.ipynb` (or combined Scala source, that we'll mention later) for Build Servers.

**Metals have to run build server that recognizes `*.ipynb` files as a `SourceItem`**. Otherwise, Metals cannot tell which Scala version of presentation compiler should run against the notebook (Kernel).

# Solution / Technical Architecture

In order to provide language support for notebooks, two things are needed.

- Convert cells to compilable Scala sources.
- Also, Metals have to convert the incoming request parameter (for cells) into the parameter for the (converted) compilable Scala source (back and forth).
- Converted sources have to be on the filesystem, because Metals expect sources to be on the filesystem.
- Build Server that recognizes `*.ipynb` as a build target.
- Where `ScalaBuildTarget.scalaVersion` should match with the Scala version of the running Kernel.
- When users open a notebook with Scala Kernel, Metals should prompt running the Build Server for the notebook, like Metals does for Ammonite script.

## Components

- `Almond BSP`
- Run on the same process with `Almond Kernel`, but the I/O will be via unix domain socket at a pre-defined location (somewhere in `~/.local/share/almond/bsp/{kernel-name}.sock`). See: https://github.com/scalameta/metals/pull/4253#issuecomment-1222091839
- `buildTarget/compile` will convert cells into compilable Scala sources (as Almond Kernel / Ammonite does) and **write the generated Scala sources onto the file system** (as Ammonite Build Server does). Also, `buildTarget/compile` should enable SemanticDB plugin.
- Answer `workspace/buildTargets`. One notebook - one buildTarget.
- Scala version should be match with Kernel's one.
- Accept `buildTarget/scalacOptions` and `ScalacOptions.classDirectory` should point to the directory where Metals can find the Almond-generated Scala sources.
- (optional) instead, we can support `workspace/wrappedSources` as [ScalaCli does](https://github.com/VirtusLab/scala-cli/blob/6efbefb1d864c0ee36156f9ac8489d0e14ee54c4/modules/scala-cli-bsp/src/main/java/scala/build/bsp/ScalaScriptBuildServer.java#L9-L10)
- `VSCode`
- Notebook LSP client, send `textDocument/*` requests to `Metals` and receive responses.
- Notify `notebookDocument/didOpen|didChange` to Metals server.
- `Metals`
- When Metals receive `notebookDocument/didOpen`, connect to Almond BSP and register the result of `workspace/buildTargets`.
- Receive `textDocument/*` request and convert the params to the generated Scala source. And then convert the result back to the cell positions.

## System flow

### When VSCode start running Almond Kernel on VSCode

```mermaid
sequenceDiagram
participant AlmondKernel
actor VSCode
participant Metals
participant AlmondBSP
VSCode->>AlmondKernel: (run and) connect to the kernel
VSCode->>Metals: notebookDocument/didOpen
Metals->>AlmondBSP: connect to AlmondBSP
Metals->>AlmondBSP: workspace/buildTargets
AlmondBSP->>Metals: buildTarget for the notebook
```

- 1. When Almond Kernel started, Almond Kernel process will create a sock file (for Almond BSP) in a pre-defined position.
- 2. When VSCode open a notebook, Metals will receive `notebookDocument/didOpen`, then Metals will try to connect to the Almond BSP via the sock file created at (1)
- Concern: Metals may have to wait until the sock file is created (?)
- 3. Then Metals asks `workspace/buildTargets` to Almond BSP, and Almmond BSP answers.
- 4. Register the mapping URI of notebook cell -> generated Scala sources to [MappedSource](https://github.com/scalameta/metals/blob/626b2a65199ea39ffca68f1e6330b67b9142a1fe/metals/src/main/scala/scala/meta/internal/metals/TargetData.scala#L303-L308)
- see: https://github.com/scalameta/metals/blob/626b2a65199ea39ffca68f1e6330b67b9142a1fe/metals/src/main/scala/scala/meta/internal/metals/ammonite/Ammonite.scala#L108-L129
- It would be even trickier than Almond one.

### When editing notebook

```mermaid
sequenceDiagram
actor VSCode
participant Metals
participant AlmondBSP
VSCode->>Metals: notebookDocument/didChange
Metals->>AlmondBSP: buildTarget/compile
```

- 1. Every time users edit notebook, Metals receive `notebookDocument/didChange`.
- 2. Then Metals will run `buildTarget/compile` to Almond BSP for generating the compilable Scala source files and generate SemanticDBs.
- I guess it would be required because users usually keep coding on Notebook without saving.
- Maybe we want to debounce the `buildTarget/compile`

## When VSCode request `textDocument/*`

- We can convert notebook cell's URI `<->` generated Scala source for the cell, using `MappedSource`.
- Conversion between position in the cell `<->` position in the generated Scala source can be done as the same way with Almond script.

# Alternative Solution

## Can we just combine cells into Ammonite script, instead of communicating with Almond BSP?

While Almond can generate accurate Scala source program, it takes many steps for providing notebook support.

Instead of using Almond, Metals can combine cells into `*.sc` file.

For example, given the following notebook cells,

```scala
// cell1
import $ivy.`io.circe::circe-generic:0.11.1`

// cell2
def add(a: Int, b: Int): Int = a + b

// cell3
add(1, 2)
```

Metals can combine those cells into `combined.sc`. The combine source will be a valid Ammmonite script file.
Now Metals can re-use the Ammonite BSP for the language support.

```scala
import $ivy.`io.circe::circe-generic:0.11.1`
def add(a: Int, b: Int): Int = a + b
add(1, 2)
```

- pros
- We don't need to implement BSP for Almond, and re-use the Ammonite BSP.
- We will have `one notebook <-> one generated scala source`, that might be easier to handle (while Almond BSP will `one notebook cell <-> one generated scala source`).
- cons
- It doesn't work if the program contains Almond-pre-imported things or [User APIs](https://almond.sh/docs/api).
- We have to run the same Scala version for Ammonite BSP with Almond Kernel, but it complicates the [Ammonite BSP logic](https://github.com/scalameta/metals/blob/fa03b00ad31307763979a00c4ccf794a2054c3fe/metals/src/main/scala/scala/meta/internal/metals/ammonite/Ammonite.scala#L184-L191) a bit.
- It adds up one more layer of conversion: `cell <-> combined sc file <-> Ammonite generated scala file`.

## Run Almond BSP only when it's from VSCode

Instead of running Almond BSP all the time, maybe we can run Almond BSP only when it's from VSCode?
I don't think we can, because:

- Looks like Kernel don't have information who is the clients, AFAIK.
- Kernel may accept connection from multiple clients. Kernel started from VSCode doesn't mean it works only with Metals.

## Almond BSP shouldn't accept `buildTarget/compile`?

Instead of accepting `buildTarget/compile` and generate Scala sources on notebook changes, we can let Kernel compile Scala sources (enabling SemanticDB and writing out the generated Scala sources to the filesystem).

It works only if Almond Kernel can detect it's running for Metals, but it looks like impossible as mentioned in the above section.

# Milestones

- [ ] Implement bare-minimum Almond BSP
- Create a sock file in a pre-defined location.
- Accept `build/initialize`
- `workspace/buildTargets` should returns the appropriate Scala version
- [ ] Metals can connect to Almond BSP
- When Metals open `*.ipynb` with `notebookDocument/didOpen`, connect to Almond Build Server, and handshake.
- [ ] Almond BSP can compile notebook cells
- Almond should accept `buildTarget/compile`
- Write out the generated Scala sources to a pre-defined location (based on base directory).
- Accept `buildTarget/scalacOptions` and `ScalacOptions.classDirectory` should point to the directory where Metals can find the Almond-generated Scala sources.
- [ ] Metals invoke `buildTarget/compile` for every notebook changes
- (optional) Metals should debounce the incoming requests
- [ ] Metals should register MappedSource between `notebook cell URI <-> generated Scala source (found from ScalacOptions.classDirectory)`.
- [ ] Metals can handle `textDocument/*` requests using presentation compiler.
- [ ] Metals can handle `textDocument/*` requests using SemanticDB
- Almond should generate SemanticDB on `buildTarget/compile`.

# Concerns

- Almond will create a sock file for Almond BSP even if users won't use Metals.
- Metals may receive `notebookDocument/didOpen` before Almond BSP starts (?)

0 comments on commit ccd1bed

Please sign in to comment.