Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release v2.4.0+0.2.2+8.19 #50

Merged
merged 75 commits into from
Nov 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
75 commits
Select commit Hold shift + click to select a range
2361b69
Implement base use of feature to check proofs. Needs fixes
K-dizzled Jul 29, 2024
9e9b6e1
Test and integrte command arg of getGoals to check proofs
K-dizzled Oct 9, 2024
415f71d
Merge main into core-lsp-rework
K-dizzled Oct 16, 2024
d5c66ac
Upgrade coq-lsp in actions to v0.2.2 for arg in getGoals
K-dizzled Oct 16, 2024
09c957e
Change type of CoqLspClient to Interface everywhere possible
K-dizzled Oct 16, 2024
e32b7f9
Clean CoqLspClientInterface
K-dizzled Oct 16, 2024
9125619
Refactor `CoqProofChecker`
K-dizzled Oct 17, 2024
ad795e7
Clean-up
K-dizzled Oct 17, 2024
872987b
Refactor creating in plugin mode: now once in
K-dizzled Oct 18, 2024
e97e1ec
Create decorator that logs execution time of the method
K-dizzled Oct 20, 2024
5baf1f8
Make computation of `initialGoal` optional
K-dizzled Oct 20, 2024
9319114
Create permanent status bar button
K-dizzled Oct 20, 2024
b2c18bf
Add refactor TODO to
K-dizzled Oct 20, 2024
3f44690
Sketch aborting the completion generation through `AbortSignal`
K-dizzled Oct 25, 2024
9689e85
Refactor `SourceFileEnvironment`: remove dirPath and rename
K-dizzled Oct 25, 2024
095b99b
Refactor `SourceFileEnvironment`: remove `fileLines`
K-dizzled Oct 25, 2024
df91211
Refactor `checkGeneratedProofs`
K-dizzled Oct 25, 2024
632c80a
Minor changes after sketching completion abort in extension
K-dizzled Oct 26, 2024
7530442
Merge branch 'core-lsp-rework' into extension-completion-abort
K-dizzled Oct 26, 2024
e56e10e
Merge pull request #45 from JetBrains-Research/extension-completion-a…
K-dizzled Oct 26, 2024
d8f591f
Fix merge of `abort-completion` into `core-lsp-refactor`
K-dizzled Oct 26, 2024
0ec7316
Add TODOs on `LspCoreRefactor`
K-dizzled Oct 26, 2024
cdd87b3
Create `SessionExtensionState`
K-dizzled Oct 27, 2024
34bdfa2
Refactor `CoqLspClient` to `CoqLspClientImpl`
K-dizzled Oct 27, 2024
bab8a26
Finalize aborting `coq-lsp` gracefully
K-dizzled Oct 27, 2024
e122675
Prepare for v2.3.1 release
K-dizzled Oct 27, 2024
4c276e9
docs: update README.md
eltociear Oct 29, 2024
848b762
Merge pull request #47 from eltociear/patch-1
K-dizzled Oct 29, 2024
33d1197
Fix cosmetics from review in PR#46
K-dizzled Nov 6, 2024
d1d4952
Fix more issues from review in PR#46
K-dizzled Nov 6, 2024
09a3420
Code style fixes and core in rankers from review in PR#46
K-dizzled Nov 6, 2024
c58c89f
Refactoring `healthStatusIndicator` according to review in PR#46
K-dizzled Nov 6, 2024
a8ac5b8
Renamings according to review in PR#46
K-dizzled Nov 6, 2024
0b36cc3
More accurate `completionAbortError` handling
K-dizzled Nov 6, 2024
f5122cd
Refactor `coqLsp` tests according to review of PR#46
K-dizzled Nov 6, 2024
75c46d3
Merge pull request #46 from JetBrains-Research/core-lsp-rework
K-dizzled Nov 6, 2024
d72fa49
Change `v2.3.1` to `v2.4.0` as changes are major
K-dizzled Nov 6, 2024
f70deac
Refactor `SessionState` & highlight vulnerabilities
GlebSolovev Nov 15, 2024
5b5abb1
Restructure theorem rankers, add comments
GlebSolovev Nov 15, 2024
e64f491
Remove redundant null-checks for theorem proof
GlebSolovev Nov 15, 2024
1d158bc
Make code style of `parsedTypes.ts` consistent
GlebSolovev Nov 15, 2024
66d0d43
Format code
GlebSolovev Nov 15, 2024
16831af
Use `ProofGoal` instead of `Goal<PpString>`
GlebSolovev Nov 15, 2024
ca16531
Implement `getFirstGoalOrThrow` wrapper properly
GlebSolovev Nov 15, 2024
34d48f3
Improve code style of coq-lsp tests
GlebSolovev Nov 15, 2024
8b3a621
Introduce `withTextDocument` wrapper
GlebSolovev Nov 15, 2024
8abe2eb
Fix `imm` submodule commit
GlebSolovev Nov 15, 2024
ea7805b
Introduce with-coq-lsp wrappers, dispose client correctly
GlebSolovev Nov 15, 2024
b0c20a4
Fix `SessionState` bugs, improve its code style
GlebSolovev Nov 16, 2024
da02a28
Fix coq-lsp usage in legacy benchmarks, refactor args
GlebSolovev Nov 16, 2024
1cc6ffd
Prevent legacy benchmarks from creating empty report on start
GlebSolovev Nov 16, 2024
649402d
Fix coq-lsp usage at parsing stage of benchmarks
GlebSolovev Nov 16, 2024
729153b
Document `CoqLspClient` invariants
GlebSolovev Nov 16, 2024
09f914a
Improve `CoqProofChecker` code style
GlebSolovev Nov 16, 2024
a129ca9
Fix bugs in internal signatures of benchmarks
GlebSolovev Nov 16, 2024
78dd974
Move abort utils into `core`
GlebSolovev Nov 16, 2024
7c3d5d6
Structure `extension` directory
GlebSolovev Nov 16, 2024
3a8ffc2
Support abort signal set-up in coq-lsp client wrappers
GlebSolovev Nov 17, 2024
9be1b07
Merge pull request #48 from JetBrains-Research/proof-checking-refacto…
K-dizzled Nov 26, 2024
28d4826
Update CHANGELOG with major updates from PR#48
K-dizzled Nov 26, 2024
43e676f
Small code-style issue refactor in `GrazieService`
K-dizzled Nov 26, 2024
0ec4392
Fix failing `Date` parsing tests on Mac
K-dizzled Nov 26, 2024
cff44b9
Fix issue with `systemMessage` for `gpt-o1`
K-dizzled Nov 26, 2024
0fc8d1d
Clarify abort controller usage at parsing stage of benchmarking
GlebSolovev Nov 26, 2024
e867f00
Support `initial_goal` being optional inside benchmarking
GlebSolovev Nov 26, 2024
7dcd906
Get `initial_goal` cached more often
GlebSolovev Nov 26, 2024
35130f8
Make `needsTheoremInitialGoals` required
GlebSolovev Nov 26, 2024
2eb29a1
Fix bug with different o1 class models name in Grazie
K-dizzled Nov 26, 2024
3aff82a
Update README with Related Papers and update future plans
K-dizzled Nov 26, 2024
477a0e8
Add TODO for initial-goals flag support inside benchmarking
GlebSolovev Nov 27, 2024
b1cd890
Fix multi-workspaces "pre" script
GlebSolovev Nov 27, 2024
86ec217
Provide `benchmark-test` task for faster development
GlebSolovev Nov 27, 2024
d5dda34
Add notes on `COQ_LSP_PATH` to benchmarking docs
GlebSolovev Nov 27, 2024
b4c0a3b
Update info on coq-lsp via nix usage
K-dizzled Nov 27, 2024
1048803
Merge pull request #49 from JetBrains-Research/benchmarking-hotfixes
K-dizzled Nov 27, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/build-and-test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ on:

env:
coqlsp-path: "coq-lsp"
coqlsp-version: "0.1.9+8.19"
coqlsp-version: "0.2.2+8.19"
artifact-name: ubuntu-latest-build

jobs:
Expand Down Expand Up @@ -60,7 +60,7 @@ jobs:
env:
OPAMYES: true
run: |
opam install coq-lsp.0.1.9+8.19
opam install coq-lsp.0.2.2+8.19
eval $(opam env)

- name: Install Node.js
Expand Down
3 changes: 2 additions & 1 deletion .prettierrc.json
Original file line number Diff line number Diff line change
Expand Up @@ -7,5 +7,6 @@
"printWidth": 80,
"importOrder": ["^[./]*llm/(.*)$", "^[./]*coqLsp/(.*)$", "^[./]*core/(.*)$", "^../", "^./"],
"importOrderSeparation": true,
"importOrderSortSpecifiers": true
"importOrderSortSpecifiers": true,
"importOrderParserPlugins": ["typescript", "@trivago/prettier-plugin-sort-imports", "decorators-legacy"]
}
19 changes: 11 additions & 8 deletions .vscode/settings.json
Original file line number Diff line number Diff line change
@@ -1,23 +1,26 @@
// Place your settings in this file to overwrite default and user settings.
{
"files.exclude": {
"**/*.vo": true,
"**/*.vok": true,
"**/*.vos": true,
"**/*.aux": true,
"**/*.glob": true,
"**/.git": true,
"**/.svn": true,
"**/.hg": true,
"**/CVS": true,
"**/.DS_Store": true,
"**/Thumbs.db": true,
"**/*.vo": true,
"**/*.vok": true,
"**/*.vos": true,
"**/*.aux": true,
"**/*.glob": true,
"out": false,
"**/*_cp_aux.v": true
"out": false
},
"search.exclude": {
"out": true // set this to false to include "out" folder in search results
},
// Turn off tsc task auto detection since we have the necessary tasks as npm scripts
"typescript.tsc.autoDetect": "off"
"typescript.tsc.autoDetect": "off",
"coq-lsp.check_only_on_request": false,
"eslint.options": {
"experimentalDecorators": true
}
}
16 changes: 16 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,21 @@
# Changelog

## 2.4.0

The version increment is minor, yet the changes are significant. The main focus of this release was to improve interaction with the `coq-lsp` server. We now completely got rid of temporary files in the `extension` part of `CoqPilot`. This was done thanks to [ejgallego](https://github.com/ejgallego) and his improvements in `coq-lsp`. Now checking of proofs is done via the `proof/goals` request via `command` parameter, which is much more reliable and faster. Now a single `lsp` server is created for the plugin until being explicitly closed, and it tracks changes in the file. If you are using `coq-lsp` plugin itself, it will help you to always keep track of whether file is checked or not. When `CoqPilot` is ran on a completely checked file, it will not bring any significant overhead apart from requests to the Completion Services.

### Public changes
- Added a toggle button to enable/disable the extension
- Using the toggle switch, user can now abort the proof generation process
- Significant performance improvements in proof checking that are especially easy to see for large files

### Internal changes
- Get rid of temporary files in the `extension` part of `CoqPilot`
- Refactor `CoqLspClient`
- Make computation of `initialGoal` for premises optional to avoid unnecessary requests to `coq-lsp`
- Introduce `with-coq-lsp` wrappers to encapsulate correct `CoqLspClient` disposal
- Introduce `CoqLspClient.withTextDocument(...)` wrapper to encapsulate opening and closing a document with `coq-lsp`

## 2.3.0

**A major upgrade of the benchmarking system**: At the moment, only a little **new** functionality is provided; moreover, the ability to run benchmarks on Tactician/CoqHammer is temporarily unavailable. However, it will soon be restored and improved. Excessive work has been done to make the benchmarking system more flexible, secure, robust, self-contained, and easy to use. Experiments via our benchmarking framework have been made more accessible than ever. The configurability and reliability of the pipeline have been improved drastically. In a nutshell, the main features of the improved benchmarking system include:
Expand Down
52 changes: 29 additions & 23 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,21 +1,19 @@
# CoqPilot

![Version](https://img.shields.io/badge/version-v2.3.0-blue?style=flat-square)
# CoqPilot ![Version](https://img.shields.io/badge/version-v2.4.0-blue?style=flat-square)

*Authors:* Andrei Kozyrev, Gleb Solovev, Nikita Khramov, and Anton Podkopaev, [Programming Languages and Tools Lab](https://lp.jetbrains.com/research/plt_lab/) at JetBrains Research.

`CoqPilot` is a [Visual Studio Code](https://code.visualstudio.com/) extension that is designed to help automate writing of Coq proofs. It uses Large Language Models to generate multiple potential proofs and then uses [coq-lsp](https://github.com/ejgallego/coq-lsp) to typecheck them. It substitutes the proof in the editor only if a valid proof is found.

# Table of Contents

- 🚀 [CoqPilot Overview](#coqpilot)
- 🚀 [CoqPilot Overview](#coqpilot-version)
- 📋 [Requirements](#requirements)
- 📚 [Related papers](#related-papers)
- 🔍 [Brief Technical Overview](#brief-technical-overview)
- 💡 [Example Usage](#example-usage)
- 🛠 [Installation](#installation)
- ▶️ [Coq-LSP Installation](#coq-lsp-installation)
- 🤖 [Building Locally](#building-locally)
- ⚠️ [Important Information](#important)
- ⚙️ [Extension Settings](#extension-settings)
- 📐 [Guide to Model Configuration](#guide-to-model-configuration)
- 🎛 [How VSCode settings work](#how-vscode-settings-work)
Expand All @@ -30,7 +28,17 @@

## Requirements

* `coq-lsp` version `0.1.9+8.19` is currently required to run the extension.
* `coq-lsp` version `0.2.2+8.19` is currently required to run the extension.

## Related papers

- **[ASE Demo'24]** *CoqPilot, a plugin for LLM-based generation of proofs*
<br />
[[Paper](https://dl.acm.org/doi/10.1145/3691620.3695357) | [arXiv](https://arxiv.org/abs/2410.19605) | [Video (5min)](https://www.youtube.com/watch?v=oB1Lx-So9Lo) | [Video (10min)](https://www.youtube.com/watch?v=P-LHXf7vntM)]
- **[AITP'24 & CoqWS'24]** *CoqPilot, a plugin for LLM-based generation of proofs*
<br />
[[Extended Abstract](https://coq-workshop.gitlab.io/2024/files/EA2.pdf) | [CoqWS Slides](https://coq-workshop.gitlab.io/2024/files/SL2.pdf)]


## Brief technical overview

Expand Down Expand Up @@ -68,14 +76,26 @@ As soon as at least one valid proof is found, it is substituted in the editor an

To run the extension, you must install a `coq-lsp` server. Depending on the system used in your project, you should install it using `opam` or `nix`. A well-configured `nix` project should have the `coq-lsp` server installed as a dependency. To install `coq-lsp` using `opam`, you can use the following commands:
```bash
opam pin add coq-lsp 0.1.9+8.19
opam pin add coq-lsp 0.2.2+8.19
opam install coq-lsp
```
For more information on how to install `coq-lsp` please refer to [coq-lsp](https://github.com/ejgallego/coq-lsp).

Either way around, if the [coq-lsp](https://github.com/ejgallego/coq-lsp) extension works well and you can see the goals and theorems in the VSCode, then `CoqPilot` should work as well. However, using [coq-lsp](https://github.com/ejgallego/coq-lsp) as a plugin for Coq support is not mandatory for `CoqPilot` to work.

If your installation of `coq-lsp` is not in the default path, you can specify the path to the `coq-lsp` server in the settings using the `coqpilot.coqLspServerPath` setting. Default value should work well for most of the cases.
If your installation of `coq-lsp` is not in the default path, you can specify the path to the `coq-lsp` server in the settings using the `coqpilot.coqLspServerPath` setting. Default value should work well for `opam`.

**IMPORTANT**: If you are using `nix` in your project, make sure to **UPDATE** the path of the `coq-lsp` server in the settings. The default path is set to `coq-lsp`, which is the default path for `opam`. If you are using `nix`, you should run the following command from inside of the `nix-shell`:
```bash
which coq-lsp
```
And then copy the path to the `coq-lsp` server and paste it into the `coqpilot.coqLspServerPath` setting.

In the benchmark the same rule applies, but the path to the `coq-lsp` server should be set as an environment variable `COQ_LSP_PATH`:
```bash
export COQ_LSP_PATH=$(which coq-lsp)
```


### Building locally

Expand Down Expand Up @@ -110,19 +130,6 @@ To run specific tests, you can use `npm run test -- -g="grep pattern"`.

The extension's architecture overview is stored in the [ARCHITECTURE.md](https://github.com/JetBrains-Research/coqpilot/blob/refactor/ARCHITECTURE.md) file. It will be extended and updated as the project evolves. -->

## Important

CoqPilot generates aux files with `_cp_aux.v` suffix. Sometimes when generation fails with exception, it is possible that such file will not be deleted. When a project is open, extension shall show a window that asks if you want to add such files to the local project gitignore.

Moreover, this repository contains a script for your convenience that adds the format of such files to the global gitignore file on your system.
- Copy the [`set_gitignore.sh`](https://github.com/JetBrains-Research/coqpilot/blob/main/set_gitignore.sh) file to your computer. Then:
```bash
chmod +x set_gitignore.sh
./set_gitignore.sh
```
It will add the format of CoqPilot aux files to your global gitignore file on the system, so that even if CoqPilot forgets to clean files up, they will not be marked as new files in git.
Comment: Such files are not visible in the VSCode explorer, because plugin adds them to the `files.exclude` setting on startup.

## Extension Settings

This extension contributes the following settings:
Expand Down Expand Up @@ -253,7 +260,7 @@ The process of running the benchmark is not perfectly automated and we are worki
npm run benchmark
```

If you aim to run the benchmark with the use of `Tactician`, you should install the `imm` project and `Tactician` with `opam`. For `opam` installation instrustion, you can refer to the [Tactician website](https://coq-tactician.github.io/manual/installation/) and [imm repository](https://github.com/weakmemory/imm/tree/master?tab=readme-ov-file#installation-via-opam-supported-up-to-the-15-version-of-imm) (this part of the README file is a little outdated, but the installation process for `coq-8.19.0` is still the same).
If you aim to run the benchmark with the use of `Tactician`, you should install the `imm` project and `Tactician` with `opam`. For `opam` installation instruction, you can refer to the [Tactician website](https://coq-tactician.github.io/manual/installation/) and [imm repository](https://github.com/weakmemory/imm/tree/master?tab=readme-ov-file#installation-via-opam-supported-up-to-the-15-version-of-imm) (this part of the README file is a little outdated, but the installation process for `coq-8.19.0` is still the same).

## Integrating other solutions

Expand Down Expand Up @@ -296,7 +303,6 @@ Then add the `hammer.`, `sauto.` or any other tactic from `CoqHammer` to the pre
## Future plans

- Currently the user needs to manually enter the nix shell to get the correct environment for the benchmarks. We are working on automating this process.
- Get rid of the overhead due to hacks with coq-lsp and the aux files.

## Release Notes

Expand Down
1 change: 1 addition & 0 deletions etc/docs/benchmark/BENCHMARKING_FRAMEWORK_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ This experiment must be run within the workspace environment, which the user act
cd dataset/imm
nix-build # builds the 'imm' project
nix-shell # activates its environment
export COQ_LSP_PATH=$(which coq-lsp) # sets up the proper coq-lsp server path
cd ../..
npm run single-workspace-benchmark
```
Expand Down
58 changes: 26 additions & 32 deletions package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 7 additions & 3 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
"url": "https://github.com/JetBrains-Research/coqpilot"
},
"publisher": "JetBrains-Research",
"version": "2.3.0",
"version": "2.4.0",
"engines": {
"vscode": "^1.82.0"
},
Expand Down Expand Up @@ -392,15 +392,18 @@
"pretest": "npm run compile && npm run lint",
"test": "npm run test-only",
"clean": "rm -rf out",
"rebuild": "npm run clean && npm run compile && npm run format",
"rebuild-test-resources": "cd ./src/test/resources/coqProj && make clean && make",
"preclean-test": "npm run clean && npm run rebuild-test-resources && npm run compile && npm run lint",
"clean-test": "npm run test-only",
"prebenchmark": "npm run preclean-test",
"premulti-workspace-benchmark": "npm run prebenchmark",
"premulti-workspaces-benchmark": "npm run prebenchmark",
"multi-workspaces-benchmark": "node ./out/benchmark/multiWorkspacesSetup.js",
"presingle-workspace-benchmark": "npm run prebenchmark",
"single-workspace-benchmark": "npm run test-executables-unsafe -- -g=\"[SourceExecutable] Single Workspace Benchmark\"",
"benchmark": "npm run single-workspace-benchmark",
"prebenchmark-test": "rm -rf benchmarksOutput",
"benchmark-test": "npm run benchmark",
"preteamcity-benchmark-setup": "npm run prebenchmark",
"teamcity-benchmark-setup": "node ./out/benchmark/teamCitySetup.js",
"teamcity-benchmark-agent": "npm run test-executables-unsafe -- -g=\"[SourceExecutable] Team City Benchmark Agent\"",
Expand Down Expand Up @@ -441,7 +444,7 @@
"event-source-polyfill": "^1.0.31",
"i": "^0.3.7",
"mocha-param": "^2.0.1",
"node-ipc": "^11.1.0",
"node-ipc": "^12.0.0",
"npm": "^10.4.0",
"openai": "^4.6.0",
"path": "^0.12.7",
Expand All @@ -450,6 +453,7 @@
"tiktoken": "^1.0.17",
"tmp": "^0.2.3",
"toml": "^3.0.0",
"ts-results": "^3.3.0",
"vscode-languageclient": "^9.0.1",
"yargs": "^17.7.2"
}
Expand Down
Loading