# Major LLM services and UI improvement #23

Status: Merged
## Commits

All 144 commits are by GlebSolovev (the diff shown below was taken from 136 of them):
- `3b241a4` Implement RequestsLogger
- `dec8c1e` Log requests & handle errors in llm services
- `dab15f4` Take file logic out from RequestsLogger
- `6a2eef5` Test RequestsLogger properly
- `24cad6e` Implement LLMService-s availability UI via events
- `49e4045` Move LLMService-s availability UI out of CoqPilot
- `58e45ab` Move out code from CoqPilot into separate files
- `0d5837f` Implement `estimateTimeToBecomeAvailable`
- `ffeed59` Move out code from LLMService
- `3c51d90` Show availability estimation to user
- `8492d2e` Test and fix time utils
- `c697666` Test and fix RequestsLogger
- `46837a1` Test and improve SyncFile
- `1474ae1` Improve code style
- `fc7ea0d` Extend LLMIterator tests
- `f81cab6` Implement MockLLMService
- `6b159bf` Test parsing of UserModelParams from JSON
- `4d3209e` Better test and improve ChatTokensFitter
- `b84058a` Update prefix of LLMService-s utils tests
- `b3e7868` Test and improve chat builders
- `7566ff9` Move llm test utils to separate folder
- `09f69a9` Test parameters resolution
- `4b10fa4` Dispose LLMServices
- `c3f0893` Make LLMService code better to use and test
- `2f79e1a` Implement PredefinedProofsService properly
- `0334b3a` Support debug mode in LLMService-s
- `77a93f3` Improve, fix and document MockLLMService
- `eab3d7a` Test `generateFromChat` and `GeneratedProof`
- `787cd69` Test generateProof with integration testing
- `6e9b3ae` Test generateProof with async stress, improve MockLLMService
- `60b0cf9` Improve common test functions, test OpenAIService
- `caf05e0` Refactor tests
- `f6bcf3e` Test GrazieService, improve conditional tests
- `3416abf` Test LMStudioService
- `edea619` Test PredefinedProofsService, improve code style
- `a09ca1b` Test and improve default availability estimator
- `54e713b` Refactor test utils
- `a0f38ab` Add npm test tasks
- `9608028` Fix no-successful-logs bug of availability estimator
- `a036f6b` Improve availability warning message
- `0469170` Find directory for services logs
- `0b24ddd` Override availability estimator of predefined proofs
- `6bbdd25` Introduce LLMServiceInternal, improve LLMService ctor
- `22cd5a1` Introduce new error handling to LLMService, document it
- `c299066` Add more wrappers, docs, refactor to LLMService package
- `430131b` Update tests according to LLMService changes
- `2e3c9c6` Cover new errors handling with tests, refactor
- `4804d09` Improve LLMService specification
- `5bce2cf` Distinguish `modelName` and `modelId`
- `936bbc8` Rename `newMessageMaxTokens` to `maxTokensToGenerate`
- `939bcb6` Update README and package.json with new settings
- `04361a4` Validate unique models' id-s
- `ba265bc` Rework settings validation
- `aee93a7` Validate models are present, refactor errors
- `7236080` Make order of services coherent
- `09f4af0` Improve settings description in package.json
- `80fcb11` Introduce, support, test `LLMServiceRequest`
- `72cd1fd` Update & refactor LLMServices events handling in UI
- `af27f50` Improve `choices` validation
- `6b9cb0b` Link to setting name on configuration error
- `4c85d6a` Refactor "new Error" away
- `55cc569` Show error's content on service failed warning
- `a3fc1b4` Improve error messages
- `6cac0ea` Improve `PredefinedProofs` params validation
- `a4709c9` Validate `OpenAi` params, test it
- `bb78922` Specify thrown error in tests
- `8a42d25` Fix "finally with async" bug
- `b9ee6b6` Log and display Coqpilot error better
- `e94a273` Implement draft ParamsResolver
- `f9246cc` Implement ModelParamsResolver-s
- `f1e1e98` Make `LLMService` generic, use `ParamsResolver`
- `fb302b9` Remove redundant `ConfigurationError`-s
- `ad51635` Support no-default-value help message
- `90cd8a3` Solve `choices` problem: add them into `ModelParams`
- `7d0192c` Move params resolution to UI, show feedback to user
- `1448460` Support parameter insertion in `ParamsResolver`
- `2637688` Improve `LLMService` typing: `GeneratedProofImpl`
- `a487595` Parameterize with `LLMServiceInternalType`
- `0ed8d5d` Return `GeneratedProof` from `LLMService` facade methods
- `a09f4e1` Improve `BasicModelParamsResolver`
- `037a180` Refactor and improve chat-factory tests
- `9fcb68c` Complete `LLMService` typing with recursive generics
- `035468c` Fix and refactor params resolvers
- `d0818c0` Design and implement powerful resolvers interfaces
- `f1f95ae` Fix `tmp.dirSync` bug
- `f2b2fc7` Refactor params resolvers, note their critical issues
- `7e3c7aa` Fix models params resolvers
- `7fbef4c` Update & refactor test utils according to latest changes
- `6083980` Update & refactor tests according to latest changes
- `6f63ef9` Test parameters resolution
- `1b891d7` Rework LLM-services' tests, expand with configuration ones
- `8e56486` Improve params-resolution UI messages
- `cfd0beb` Fix resolvers by specifying schemas
- `7456d6e` Move resolvers to services, test valid params
- `7712fe4` Test params are read correctly
- `a97af3b` Fix empty lists failing LoggerRecord bug
- `7bc77ae` Support censorship in GenerationsLogger
- `1928e2e` Refactor settings for `GenerationsLogger`
- `9e358d4` Tweak LLM services tests timeouts
- `0c256aa` Support context length error in OpenAiService
- `0e37775` Support connection error in OpenAiService
- `42ec5c7` Support and use rich messages in resolvers
- `7f3787d` Add default tokens params to OpenAiService
- `b20efb2` Fix messages, defaults in settings
- `27ff5f7` Make code safer: get rid of any-s
- `ffa8ec4` Move `LLMServiceInternal` to a separate file
- `c4cec91` Improve config parsers names
- `88e8286` Improve eslint naming-convention
- `d08cd33` Make `estimatedTokens` rich and clear
- `ee4131d` Introduce and support `RemoteConnectionError`
- `c7b33eb` Setup debugging for CI
- `21fbcc5` Make all tests run on CI
- `a3a289c` Make possible to run CI manually
- `9adb81a` Update coqpilot version
- `384f2bb` Fix "override the same value" bug
- `8dacf40` Introduce & use stringify util for consistent messages
- `f30d7c9` Show overriden message only if user specified value
- `8b10d2f` Improve top-level errors handling
- `9851280` Forbid additional properties in schemas
- `bb2f5d3` Refactor editor messages: move all in one place
- `23c3ebd` Handle errors from Ajv
- `e53e388` Minor package.json updates
- `d1c9b84` Improve tests for user settings schemas
- `90c0ded` Provide type checks for resolver's properties
- `bab6377` Support `ValidParamsResolverImpl` type check
- `a032df7` Spread type checks to single parameter resolvers
- `57b9252` Test resolution of hidden wrong type
- `c64d181` Document parameter resolvers
- `6cb8fb9` Improve OpenAI tokens params default resolution
- `b90100f` Mention defaults in OpenAI settings description
- `2cb6632` Merge branch 'v2.2.0-dev' into llm-services-improvement
- `9f6bb61` Update benchmarks: params resolution, muted mode
- `85f85b2` Test and fix proof extraction in `GeneratedProof`
- `ae0fefc` Improve English in CHANGELOG.md, items honesty
- `aa817fd` Present 2.2.0 changes
- `0bc235a` Trigger CI on development PR-s
- `1ca0983` Comment on text format used by `GenerationsLogger`
- `71270b1` Manage node version via `.nvmrc`
- `873c667` Cache opam on CI
- `eb83834` Remove redundant CI step
- `a3e32cd` Enable `dune-cache` on CI
- `995d89c` Try to make opam cache work on CI
- `97422f6` Set back opam path to cache
- `8f9081a` Set up opam caches on CI properly
## Changes to `CHANGELOG.md`

```diff
@@ -1,95 +1,146 @@
-# Change Log
+# Changelog
 
+## 2.2.0
+
+### Public changes
+
+- Support time estimation for LLM services to become available after failure via logging proof-generation requests. This information is shown to the user.
+- Set up interaction between `LLMService`-s and UI to report errors that happened during proof-generation.
+- Improve LLM services' parameters: their naming, transparency, and description.
+  - Introduce `modelId` to distinguish a model identifier from the name of an OpenAI / Grazie model.
+  - Rename `newMessageMaxTokens` to `maxTokensToGenerate` for greater clarity.
+  - Update the settings description, and make it more user-friendly.
+- Significantly improve settings validation.
+  - Use parameters resolver framework to resolve parameters (with overrides and defaults) and validate them.
+  - Support messages about parameter resolution: both failures and unexpected overrides.
+  - Clarify existing error messages.
+  - Add some general checks: input models have unique `modelId`-s, there are models to generate proofs with.
+- Improve interaction with OpenAI.
+  - Notify the user of configuration errors (invalid model name, incorrect API key, maximum context length exceeded) and connection errors.
+  - Support resolution of `tokensLimit` and `maxTokensToGenerate` with recommended defaults for known models.
+- Fix minor bugs and make minor improvements detected by thorough testing.
+
+### Internal changes
+
+- Rework and document LLMService architecture: `LLMServiceInternal`, better facades, powerful typing.
+- Introduce hierarchy for LLMService errors. Support proper logging and error handling inside `LLMService`-s.
+- Rework settings validation.
+  - Refactor `SettingsValidationError`, move all input parameters validation to one place and make it coherent.
+  - Design and implement a powerful and well-documented parameters resolver framework.
+
+### Testing infrastructure changes
+
+- Test the LLM Services module thoroughly.
+- Improve test infrastructure in general by introducing and structuring utils.
+- Fix the issue with building test resources on CI. Set up CI debugging, and enable launching CI manually.
+
-### 2.1.0
+## 2.1.0
+
 Major:
 - Create a (still in development and improvement) benchmarking system. A guide on how to use it is in the README.
-- Conduct an experiment on the performance of different LLMs, using the developed infrastructure. Benchmarking report is located in the [docs folder](etc/docs/benchmarking_report01.md).
+- Conduct an experiment on the performance of different LLMs, using the developed infrastructure. The benchmarking report is located in the [docs folder](etc/docs/benchmarking_report01.md).
 - Correctly handle and display settings which occur when the user settings are not correctly set.
 
 Minor:
-- Set order of contributed settings.
+- Set the order of contributed settings.
 - Add a comprehensive user settings guide to the README.
-- Fix issue with Grazie service not being able to correctly accept coq ligatures.
-- Fix issue that occured when generated proof contained `Proof using {...}.` construct.
+- Fix the issue with Grazie service not being able to correctly accept coq ligatures.
+- Fix the issue that occurred when the generated proof contained the `Proof using {...}.` construct.
 
-### 2.0.0
+## 2.0.0
+
-- Added multiple strategies for ranking theorems from the working file. As LLM context window is limited, we sometimes should somehow choose a subset of theorems we want to provide as references to the LLM. Thus, we have made a few strategies for ranking theorems. Now there are only 2 of them, but there are more to come. Now we have a strategy that randomly picks theorems, and also the one that ranks them depending on the distance from the hole.
+- Added multiple strategies for ranking theorems from the working file. As the LLM context window is limited, we sometimes should somehow choose a subset of theorems we want to provide as references to the LLM. Thus, we have made a few strategies for ranking theorems. Now there are only 2 of them, but there are more to come. Now we have a strategy that randomly picks theorems, and also the one that ranks them depending on the distance from the hole.
 - Now different holes are solved in parallel. This is a huge improvement in terms of performance.
-- Implemented multi-round fixing procedure for the proofs from the LLM. It can now be configured in the settings. One can set the amount of attempts for the consequtive proof fixing with compiler feedback.
+- Implemented multi-round fixing procedure for the proofs from the LLM. It can now be configured in the settings. One can set the number of attempts for the consecutive proof fixing with compiler feedback.
 - Added an opportunity to use LM Studio as a language model provider.
-- More accurate token count. Tiktoken is now used for open-ai models.
-- Different logging levels now supported.
+- More accurate token count. Tiktoken is now used for OpenAI models.
+- Different logging levels are now supported.
 - The LLM iterator now supports adding a sequence of models for each service. This brings freedom to the user to experiment with different model parameters.
 - Now temperature, prompt, and other parameters are configurable in the settings.
 
-### 1.9.0
-- Huge refactoring done. Project re organized.
+## 1.9.0
+
+- Huge refactoring is done. Project reorganized.
 
-### 1.5.3
+## 1.5.3
+
 - Fix Grazie service request headers and endpoint.
 
-### 1.5.2
+## 1.5.2
+
 - Fix issue with double document symbol provider registration (@Alizter, [#9](https://github.com/JetBrains-Research/coqpilot/issues/9))
 
-### 1.5.1
-- Add support of the Grazie platform as an llm provider.
+## 1.5.1
+
+- Add support for the Grazie platform as an LLM provider.
 
-### 1.5.0
-- Now when the hole can be solved by a single tactic solver, using predefined tactics, gpt will not be called, LLMs are now fetched consequently.
+## 1.5.0
+
+- Now when the hole can be solved by a single tactic solver, using predefined tactics, OpenAI and Grazie will not be called, LLMs are now fetched consequently.
 - Parallel hole completion is unfortunately postponed due to the implementation complexity. Yet, hopefully, will still be implemented in milestone `2.0.0`.
 
-### 1.4.6
-- Fix issue with plugin breaking after parsing a file containing theorem without `Proof.` keyword.
+## 1.4.6
+
+- Fix the issue with the plugin breaking after parsing a file containing theorem without the `Proof.` keyword.
 
-### 1.4.5
+## 1.4.5
+
 - Fix formatting issues when inserting the proof into the editor.
 
-### 1.4.4
-- Do not require a theorem to be `Admitted.` for coqpilot to prove holes in it.
-- Correctly parse theorems that are declared with `Definition` keyword.
+## 1.4.4
+
+- Do not require a theorem to be `Admitted.` for CoqPilot to prove holes in it.
+- Correctly parse theorems that are declared with the `Definition` keyword.
 
-### 1.4.3
-- Tiny patch with shuffling of the hole array.
+## 1.4.3
+
+- Tiny patch with the shuffling of the hole array.
 
-### 1.4.2
-- Now no need to add dot in the end of the tactic, when configuring a single tactic solver.
+## 1.4.2
+
+- Now no need to add a dot at the end of the tactic, when configuring a single tactic solver.
 - Automatic reload settings on change in the settings file. Not all settings are reloaded automatically,
-but the most ones are. The ones that are not automatically reloaded: `useGpt`, `coqLspPath`, `parseFileOnInit`.
-- Added command that solves admits in selected region. Also added that command to the context menu (right click in the editor).
+but most ones are. The ones that are not automatically reloaded: `useGpt`, `coqLspPath`, `parseFileOnInit`.
+- Added a command that solves admits in a selected region. Also added that command to the context menu (right-click in the editor).
 - Fix toggle extension.
 
-### 1.4.1
+## 1.4.1
+
 - Add a possibility to configure a single tactic solver.
 
-### 1.4.0
-- Add command to solve all admitted holes in the file.
+## 1.4.0
+
+- Add a command to solve all admitted holes in the file.
 - Fixing bugs with coq-lsp behavior.
 
-### 1.3.1
+## 1.3.1
+
 - Test coverage increased.
 - Refactoring client and ProofView.
 - Set up CI.
 
-### 1.3.0
+## 1.3.0
+
 - Fix bug while parsing regarding the updated Fleche doc structure in coq-lsp 0.1.7.
 - When GPT generated a response containing three consecutive backtick symbols it tended to
 break the typecheking of the proof. Now solved.
 - Speed up the file parsing process.
 
-### 1.2.1
+## 1.2.1
+
 - Add clearing of aux file after parsing.
 
-### 1.2.0
-- Fix error with llm silently failing. Now everything that comes from llm that is not handled inside plugin is presented to user as a message (i.e. incorrect apiKey exception).
-- Fix toggle button.
-- Fix diagnostics being shown to non coq-lsp plugin coq users.
-- Add output stream for the logs in vscode output panel.
+## 1.2.0
+
+- Fix error with llm silently failing. Now everything that comes from LLM that is not handled inside the plugin is presented to the user as a message (i.e. incorrect apiKey exception).
+- Fix the toggle button.
+- Fix diagnostics being shown to non-`coq-lsp` plugin coq users.
+- Add output stream for the logs in the vscode output panel.
 
-### 1.1.0
-Now proof generation could be run in any position inside the theorem. There is no need to retake file snapshot after each significant file change.
-More communication with `coq-lsp` is added. Saperate package `coqlsp-client` no longer used.
+## 1.1.0
+
+Now proof generation could be run in any position inside the theorem. There is no need to retake a file snapshot after each significant file change.
+More communication with `coq-lsp` is added. The separate package `coqlsp-client` is no longer used.
 
-### 0.0.1
-Initial release of coqpilot.
+## 0.0.1
+
+The initial release of CoqPilot.
```
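The changelog's error-handling items name `ConfigurationError` and `RemoteConnectionError` (both also appear in the commit history). The following is a minimal TypeScript sketch of how such a hierarchy could be shaped; the base class, the `GenerationFailedError` kind, and all field names are illustrative assumptions, not the extension's actual code:

```typescript
// Illustrative sketch only. `ConfigurationError` and `RemoteConnectionError`
// are names from this PR; everything else here is assumed for illustration.

/** Base class for all errors raised inside an LLMService. */
abstract class LLMServiceError extends Error {
    constructor(message: string) {
        super(message);
        this.name = new.target.name;
    }
}

/** User-fixable problems: invalid model name, bad API key, context overflow. */
class ConfigurationError extends LLMServiceError {}

/** Network-level failures when reaching a remote provider. */
class RemoteConnectionError extends LLMServiceError {}

/** Any other failure of a proof-generation request. */
class GenerationFailedError extends LLMServiceError {
    constructor(readonly cause: Error) {
        super(`generation failed: ${cause.message}`);
    }
}

// A UI layer can then branch on the error kind to choose the message shown:
function describeForUser(e: LLMServiceError): string {
    if (e instanceof ConfigurationError) {
        return `Please check your settings: ${e.message}`;
    }
    if (e instanceof RemoteConnectionError) {
        return `Connection problem: ${e.message}. Retrying may help.`;
    }
    return e.message;
}
```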
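The availability-estimation item works from a log of proof-generation requests; `estimateTimeToBecomeAvailable` is the name used in the commits. Below is a minimal sketch of one plausible heuristic; the exponential backoff and the record shape are assumptions for illustration, not CoqPilot's actual estimator:

```typescript
// Illustrative sketch: estimate service availability from a log of request
// outcomes. Only the function name comes from this PR's commit history.

interface LoggedRequest {
    timestampMillis: number;
    succeeded: boolean;
}

/**
 * Suggests how long to wait (from `nowMillis`) before the service is likely
 * available again: the more consecutive failures at the tail of the log,
 * the longer the suggested wait.
 */
function estimateTimeToBecomeAvailable(
    log: LoggedRequest[],
    nowMillis: number
): number {
    let consecutiveFailures = 0;
    for (let i = log.length - 1; i >= 0 && !log[i].succeeded; i--) {
        consecutiveFailures++;
    }
    if (consecutiveFailures === 0) {
        return 0; // the last request succeeded => assume available now
    }
    const lastFailureAt = log[log.length - 1].timestampMillis;
    const baseDelayMillis = 1000;
    const suggestedDelay =
        baseDelayMillis * 2 ** Math.min(consecutiveFailures - 1, 10);
    return Math.max(0, lastFailureAt + suggestedDelay - nowMillis);
}

// Example: two failures a second apart suggest waiting 1.5s more at t = 2.5s.
const waitMillis = estimateTimeToBecomeAvailable(
    [
        { timestampMillis: 0, succeeded: true },
        { timestampMillis: 1000, succeeded: false },
        { timestampMillis: 2000, succeeded: false },
    ],
    2500
);
console.log(waitMillis); // 1500
```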
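The parameters-resolution items describe resolving user-provided parameters with defaults and overrides, validating them, and reporting messages for both failures and unexpected overrides. Here is a simplified sketch of that workflow: `modelId`, `maxTokensToGenerate`, `tokensLimit`, and the type names `UserModelParams`/`ModelParams` come from this PR, while the exact fields and resolver shape shown are assumptions:

```typescript
// Illustrative sketch of parameter resolution with defaults, validation, and
// user-facing messages. Setting and type names are from this PR; the
// framework shape is a simplified assumption.

interface UserModelParams {
    modelId: string;
    maxTokensToGenerate?: number;
    tokensLimit?: number;
}

interface ModelParams {
    modelId: string;
    maxTokensToGenerate: number;
    tokensLimit: number;
}

interface ResolutionResult {
    params?: ModelParams; // absent if resolution failed
    messages: string[]; // failure reasons and default/override notes
}

function resolveParams(
    user: UserModelParams,
    defaults: { maxTokensToGenerate: number; tokensLimit: number }
): ResolutionResult {
    const messages: string[] = [];
    const maxTokensToGenerate =
        user.maxTokensToGenerate ?? defaults.maxTokensToGenerate;
    const tokensLimit = user.tokensLimit ?? defaults.tokensLimit;

    if (user.maxTokensToGenerate === undefined) {
        messages.push(
            `"maxTokensToGenerate" defaulted to ${defaults.maxTokensToGenerate}`
        );
    }
    if (maxTokensToGenerate > tokensLimit) {
        messages.push(
            `"maxTokensToGenerate" (${maxTokensToGenerate}) must not exceed ` +
                `"tokensLimit" (${tokensLimit})`
        );
        return { messages }; // resolution failed
    }
    return {
        params: { modelId: user.modelId, maxTokensToGenerate, tokensLimit },
        messages,
    };
}

// The "unique modelId-s" check from the changelog, over all configured models:
function validateUniqueModelIds(models: UserModelParams[]): string[] {
    const seen = new Set<string>();
    const errors: string[] = [];
    for (const m of models) {
        if (seen.has(m.modelId)) {
            errors.push(`duplicate modelId: "${m.modelId}"`);
        }
        seen.add(m.modelId);
    }
    return errors;
}
```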
## Conversations

**Review comment:** Need to add a `.nvmrc` file and fix the Node version.
**Reply:** Yup, made it pretty much the same as you did in `ai_agents_server`, and also used it on CI.
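For readers unfamiliar with the `.nvmrc` approach discussed here: it is a one-line file pinning the Node version, which CI can read so that local and CI environments stay in sync. A minimal sketch, assuming GitHub Actions and `actions/setup-node` (which accepts a `node-version-file` input); the version number and step layout are examples, not this repository's actual configuration:

```yaml
# Example .nvmrc content (a single line; the version here is illustrative):
#   20

# Excerpt of a GitHub Actions job that reads it:
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version-file: ".nvmrc" # keeps CI's Node in sync with local nvm
```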