
Major LLM services and UI improvement #23

Merged · 144 commits merged from llm-services-improvement into v2.2.0-dev on May 23, 2024

Conversation

@GlebSolovev GlebSolovev (Collaborator) commented May 22, 2024

Changes summary

⏳ Estimate the time for LLM services to become available by logging proof-generation requests

  • Implement GenerationsLogger
    • capable of thoroughly logging all LLM service generation requests in a human-readable way, which is also useful for debugging;
    • with support for censoring sensitive data (for example, API keys).
  • Implement a default algorithm to estimate the time for an LLM service to become available (a sketch follows below).
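
A minimal sketch of what such an estimation could look like, assuming a simple exponential backoff over the requests read from the log; the record shape, the backoff constants, and the function name are illustrative, not the actual CoqPilot implementation:

```typescript
// Hypothetical sketch: estimate when an LLM service may become available again,
// based on generation requests read from the log (the record shape is an assumption).
interface GenerationRecord {
    timestampMillis: number;
    succeeded: boolean;
}

function estimateTimeToBecomeAvailableMillis(records: GenerationRecord[]): number {
    const lastRecord: GenerationRecord | undefined = records[records.length - 1];
    if (lastRecord === undefined || lastRecord.succeeded) {
        return 0; // empty log or the last generation succeeded: available right now
    }
    // Count the failures logged since the last successful generation.
    let consecutiveFailures = 0;
    for (const record of [...records].reverse()) {
        if (record.succeeded) {
            break;
        }
        consecutiveFailures++;
    }
    // Exponential backoff from the last failure, capped at one day (constants are assumptions).
    const baseDelayMillis = 1000;
    const capMillis = 24 * 60 * 60 * 1000;
    const delayMillis = Math.min(baseDelayMillis * 2 ** (consecutiveFailures - 1), capMillis);
    return Math.max(0, lastRecord.timestampMillis + delayMillis - Date.now());
}
```

The intended behaviour is that the estimated delay grows with every consecutive failure found in the log and drops back to zero as soon as a successful generation is recorded.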

🕵️ Support proper logging and error handling inside LLMService-s

  • Introduce a hierarchy of LLMService errors: LLMServiceError, ConfigurationError, RemoteConnectionError, GenerationFailedError (sketched after this list).
  • Handle these errors properly inside the LLMService implementations:
    • throw different errors for different causes;
    • repack errors where needed.
  • Implement proper logging based on the errors thrown, with both EventsLogger and GenerationsLogger.
  • Introduce LLMServiceRequest to make data transfer coherent between different logic modules.
  • Design and implement interaction between LLMService-s and UI to report errors.
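
The error names above come from this PR; the class bodies below are only a sketch of one plausible way to lay out such a hierarchy, and the constructor shapes are assumptions:

```typescript
// Sketch of the error hierarchy named above; constructor shapes are assumptions.
class LLMServiceError extends Error {
    constructor(message: string, readonly underlyingError?: Error) {
        super(message);
        this.name = "LLMServiceError";
    }
}

// The user has configured the service or a model incorrectly (e.g. a bad API key).
class ConfigurationError extends LLMServiceError {
    constructor(message: string) {
        super(message);
        this.name = "ConfigurationError";
    }
}

// The remote service could not be reached or failed at the transport level.
class RemoteConnectionError extends LLMServiceError {
    constructor(message: string, underlyingError?: Error) {
        super(message, underlyingError);
        this.name = "RemoteConnectionError";
    }
}

// Generation itself failed; the original error is repacked and kept for logging.
class GenerationFailedError extends LLMServiceError {
    constructor(underlyingError: Error) {
        super(`generation failed: ${underlyingError.message}`, underlyingError);
        this.name = "GenerationFailedError";
    }
}
```

In this layout, "repacking" an error amounts to catching a low-level failure and wrapping it, e.g. `throw new GenerationFailedError(error)`, so the original cause stays attached for EventsLogger and GenerationsLogger.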

🏯 Rework LLMService architecture

  • Introduce and implement LLMServiceInternal that:
    • separates and hides the actual implementation of the interaction with a model;
    • restricts the visibility of some methods and properties;
    • provides a convenient way to support new services easily via wrapper methods.
  • Rework facade classes and methods of LLMService to make them both concise and powerful.
  • Design and support complete, powerful typing for LLMService classes based on recursive generics (a compressed sketch follows after this list).
    • This makes any further development involving LLMService-s safer and easier.
  • Document core LLMService architecture.
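
A compressed, hypothetical sketch of the typing idea: the facade class is parameterized both by its resolved-parameters type and by its internal counterpart, so every concrete service gets precisely typed methods. Only the LLMService and LLMServiceInternal names come from this PR; the generic layout, fields, and method names are assumptions:

```typescript
// Illustrative sketch only; the real design based on recursive generics is more elaborate.
interface ResolvedModelParams {
    modelId: string;
    maxTokensToGenerate: number;
}

abstract class LLMServiceInternal<Params extends ResolvedModelParams> {
    // The actual interaction with the model is hidden behind this single method.
    abstract generateFromPrompt(params: Params, prompt: string): Promise<string[]>;
}

abstract class LLMService<
    Params extends ResolvedModelParams,
    Internal extends LLMServiceInternal<Params>
> {
    protected abstract readonly internal: Internal;

    // Facade method: callers only ever see resolved, narrowly typed parameters.
    generateProof(params: Params, goal: string): Promise<string[]> {
        return this.internal.generateFromPrompt(params, goal);
    }
}
```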

✅ Test everything

  • Test the LLM services module completely:
    • including facade, core and utility functionality;
    • via both unit and integration tests, including stress tests and asynchronous behaviour;
    • in total, about 150 new tests.
  • Improve test infrastructure in general:
    • introduce a lot of utility functions and mocks (a toy mock is sketched after this list);
    • structure them properly.
  • Fix minor bugs and make minor improvements detected by tests.
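
For illustration only, here is a toy asynchronous test built around a mock; MockLLMService and the test itself are hypothetical and not taken from the actual test suite:

```typescript
import { strictEqual } from "node:assert";

// Hypothetical mock: returns canned proofs instead of calling a real model.
class MockLLMService {
    constructor(private readonly cannedProofs: string[]) {}

    async generateProof(_goal: string): Promise<string[]> {
        return this.cannedProofs;
    }
}

// A minimal asynchronous check in the spirit of the unit tests described above.
async function testMockReturnsAllCannedProofs(): Promise<void> {
    const mock = new MockLLMService(["auto.", "intros; auto."]);
    const proofs = await mock.generateProof("forall n : nat, n = n");
    strictEqual(proofs.length, 2);
}

testMockReturnsAllCannedProofs().catch((error) => {
    console.error(error);
    process.exit(1);
});
```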

🚀 Fix & improve CI

  • Fix the issue with building a test Coq project resource.
  • Thereby make the previously "non-CI" tests pass successfully on CI.
  • Set up CI debugging for future needs.
    • Now it is possible to connect to the running GitHub Actions worker via SSH and debug the environment. This mode is activated only if a task fails.
  • Enable launching CI manually.
  • Double the speed of CI by setting up caches for OCaml and opam dependencies.

🤝 Improve LLM services' parameters: their naming, transparency, and descriptions

  • Introduce modelId to distinguish a model identifier from the name of an OpenAI / Grazie model.
  • Rename newMessageMaxTokens to maxTokensToGenerate for greater clarity.
  • Add defaultChoices to ModelParams to make its resolution more transparent in the code.
  • Update the settings descriptions and make them more user-friendly (an illustrative shape of these parameters is sketched after this list).
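
An illustrative TypeScript shape tying these settings together; only modelId, maxTokensToGenerate, and defaultChoices are named in this PR, while the remaining fields and comments are assumptions:

```typescript
// Illustrative shape of the model parameters after this PR; fields other than
// modelId, maxTokensToGenerate and defaultChoices are assumptions.
interface ModelParamsSketch {
    // Identifier the user picks for this configuration; must be unique and is
    // distinct from the OpenAI / Grazie model name itself.
    modelId: string;
    // The underlying model name, e.g. "gpt-4" for OpenAI (example value).
    modelName: string;
    // Renamed from newMessageMaxTokens: the per-request generation budget.
    maxTokensToGenerate: number;
    // How many proof candidates to request when the caller does not override it.
    defaultChoices: number;
}
```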

🕊 Rework and significantly improve settings validation

  • Improve Ajv schemas for the parameters.
  • Refactor validation errors into a single SettingsValidationError,
    • which supports hints about which setting to open.
  • Move all input-parameter validation to one place and make it coherent.
  • Add some general checks:
    • input models have unique modelId-s;
    • there are models to generate proofs with.
  • Design and implement a powerful parameters-resolver framework (a declarative sketch follows after this list) that:
    • makes it possible for LLM services to resolve input parameters (with overrides and defaults) and validate them;
    • makes it possible for developers to specify parameter resolution declaratively;
    • provides great compile-time type checks for developers;
    • is well-documented.
  • Significantly improve messages shown to users:
    • improve existing ones;
    • support messages about parameter resolution: both failures and unexpected overrides.
  • Decipher Ajv errors and make them clear both in the UI and for debugging.
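
A hypothetical sketch of what declaring parameter resolution could look like; ParamSpec, resolveParam, and the whole API below are illustrative and only demonstrate the idea of declaring defaults and checks in one place:

```typescript
// Hypothetical declarative resolver: each parameter gets its default and
// validation declared in a single spec; none of these names are the actual API.
interface ParamSpec<T> {
    name: string;
    defaultValue?: T;
    validate?: (value: T) => string | undefined; // error message, or undefined if valid
}

function resolveParam<T>(spec: ParamSpec<T>, userValue?: T): T {
    let value: T;
    if (userValue !== undefined) {
        value = userValue; // an explicit user override always wins
    } else if (spec.defaultValue !== undefined) {
        value = spec.defaultValue;
    } else {
        throw new Error(`parameter "${spec.name}" is required but was not provided`);
    }
    const validationError = spec.validate?.(value);
    if (validationError !== undefined) {
        throw new Error(`parameter "${spec.name}" is invalid: ${validationError}`);
    }
    return value;
}

// Usage: resolve maxTokensToGenerate with a default and a sanity check.
const maxTokensToGenerate = resolveParam(
    {
        name: "maxTokensToGenerate",
        defaultValue: 2048,
        validate: (value) => (value > 0 ? undefined : "must be positive"),
    },
    undefined // no user override, so the default is used
);
```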

💚🤖 Improve interaction with OpenAI

  • Handle OpenAI runtime errors and repack them into clear configuration errors shown to users.
  • Specify default token parameters for OpenAIService:
    • gather values for various models from different sources;
    • implement an algorithm to resolve maxTokensToGenerate appropriately (a simplified sketch follows below).
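
A simplified sketch of such a resolution rule, assuming per-model context sizes and a "leave room for the prompt" heuristic; the context-size values, the fallback, and the function itself are illustrative, not the values actually shipped:

```typescript
// Example context sizes; not the exact table used by CoqPilot.
const exampleContextSizes = new Map<string, number>([
    ["gpt-3.5-turbo", 16385],
    ["gpt-4", 8192],
]);

function resolveMaxTokensToGenerate(
    modelName: string,
    promptTokens: number,
    userValue?: number
): number {
    if (userValue !== undefined) {
        return userValue; // an explicit user setting always wins
    }
    const contextSize = exampleContextSizes.get(modelName);
    if (contextSize === undefined) {
        return 1024; // conservative fallback for unknown models (assumption)
    }
    // Leave the prompt in the context window and use at most half of what remains.
    return Math.max(256, Math.floor((contextSize - promptTokens) / 2));
}
```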

🌿 Improve code quality

  • Split CoqPilot into smaller pieces.
  • Make Error throwing coherent.
  • Get rid of unnecessary any-s.
  • Output values coherently.
  • Make the order of LLM services coherent.
  • Improve linters' configuration.

@GlebSolovev GlebSolovev self-assigned this May 22, 2024
Collaborator

Need to add a .nvmrc file and fix the Node version

Collaborator Author

Yup, made it pretty much the same as you did in ai_agents_server and also used it on CI

// Escape all newlines so that each serialized record occupies a single line of the log file.
return text.replace(/\n/g, "\\n");
}

static deserealizeFromString(rawRecord: string): [LoggerRecord, string] {

@K-dizzled K-dizzled (Collaborator) commented May 22, 2024

Why is the textual file format best for the logs?

Collaborator Author

It is indeed a good question worth answering, including in the docs. I left a note in GenerationsLogger's docs about the possible performance overhead and how to resolve the issue if it ever arises in practice.

No worries for now though: neither the tests nor I experienced any problems; on the contrary, only +vibes from working with human-readable logs during debugging 🕊
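
For context, the appeal of the textual format discussed in this thread is that escaping newlines keeps exactly one record per line, so the log stays greppable and human-readable. The snippet below is a self-contained illustration of that idea, not the actual GenerationsLogger code:

```typescript
// Escape newlines so that a multi-line text occupies a single log line.
function escapeForLogLine(text: string): string {
    return text.replace(/\n/g, "\\n");
}

// Undo the escaping when reading the record back.
function unescapeFromLogLine(line: string): string {
    return line.replace(/\\n/g, "\n");
}

const prompt = "Theorem foo:\n  forall n, n = n.";
const logLine = escapeForLogLine(prompt); // "Theorem foo:\\n  forall n, n = n."
console.log(unescapeFromLogLine(logLine) === prompt); // prints: true
```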

@JetBrains-Research JetBrains-Research deleted a comment from K-dizzled May 23, 2024
@K-dizzled K-dizzled merged commit fdf5e0d into v2.2.0-dev May 23, 2024
2 checks passed
@K-dizzled K-dizzled deleted the llm-services-improvement branch May 23, 2024 08:40