Releases · Aleph-Alpha/intelligence-layer-sdk

12 Sep 12:15

NiklasKoehneckeAA

v6.0.0

d3c6b41

v6.0.0

6.0.0

Features

Remove cap for max_concurrency in LimitedConcurrencyClient.
Introduce abstract LanguageModel class to integrate with LLMs from any API
- Every LanguageModel supports echo to retrieve log probs for an expected completion given a prompt
Introduce abstract ChatModel class to integrate with chat models from any API
- Introducing Pharia1ChatModel for usage with pharia-1 models.
- Introducing Llama3ChatModel for usage with llama models.
Upgrade ArgillaWrapperClient to use Argilla v2.x
(Beta) Add DataClient and StudioDatasetRepository as connectors to Studio for submitting data.
Add the optional argument generate_highlights to MultiChunkQa, RetrieverBasedQa and SingleChunkQa. This makes it possible to disable highlighting for performance reasons.
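
As a rough illustration of the abstract-class idea in the feature list above, the following sketch defines an interface with an echo method that returns log probs for an expected completion. All names here (`SketchLanguageModel`, `UniformToyModel`, `completion_log_prob`) and the method signatures are assumptions made for this example; they are not taken from the SDK and its actual `LanguageModel` class may look different.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Tuple


class SketchLanguageModel(ABC):
    """Hypothetical interface; NOT the SDK's actual LanguageModel class."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Return a completion for the prompt."""

    @abstractmethod
    def echo(self, prompt: str, expected_completion: str) -> List[Tuple[str, float]]:
        """Return (token, log prob) pairs for the expected completion."""


class UniformToyModel(SketchLanguageModel):
    """Toy backend: whitespace tokens, each with probability 1 / vocab_size."""

    def __init__(self, vocab_size: int = 4) -> None:
        self.vocab_size = vocab_size

    def generate(self, prompt: str) -> str:
        # A real backend would call its API here; the toy just repeats the prompt.
        return prompt

    def echo(self, prompt: str, expected_completion: str) -> List[Tuple[str, float]]:
        log_prob = -math.log(self.vocab_size)
        return [(token, log_prob) for token in expected_completion.split()]


def completion_log_prob(model: SketchLanguageModel, prompt: str, completion: str) -> float:
    """Sum the token log probs to score a whole expected completion."""
    return sum(log_prob for _, log_prob in model.echo(prompt, completion))
```

Code written against the abstract class, such as `completion_log_prob`, then works with any backend that implements the two methods.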

Fixes

Increase number of returned log_probs in EloQaEvaluationLogic to avoid missing a valid answer

Deprecations

Removed DefaultArgillaClient
Deprecated Llama2InstructModel

Breaking Changes

The argilla-server image version was upgraded from argilla-server:v1.26.0 to argilla-server:v1.29.0 to maintain compatibility.
- Note: We also updated our elasticsearch argilla backend to 8.12.2
Full Changelog: v5.1.0...v6.0.0

15 Aug 08:21

NiklasKoehneckeAA

v5.1.0

84a5b41

v5.1.0

5.1.0

Features

Updated DocumentIndexClient with support for metadata filters.
- Add documentation for filtering to document_index.ipynb.
Add StudioClient as a connector for submitting traces.
You can now specify a chunk_overlap when creating an index in the Document Index.
Add support for monitoring progress in the document index connector when embedding documents.

Fixes

TaskSpan now properly sets its status to Error on crash.
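
One common way to get this behavior is a context manager that records the failure in `__exit__` and lets the exception propagate. The sketch below shows that pattern only; `SketchSpan` and `SpanStatus` are hypothetical names, and this is not the SDK's `TaskSpan` implementation.

```python
from enum import Enum


class SpanStatus(Enum):
    OK = "OK"
    ERROR = "ERROR"


class SketchSpan:
    """Hypothetical span; NOT the SDK's TaskSpan."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = SpanStatus.OK

    def __enter__(self) -> "SketchSpan":
        return self

    def __exit__(self, exc_type, exc, traceback) -> bool:
        # Mark the span as failed if the body raised.
        if exc_type is not None:
            self.status = SpanStatus.ERROR
        # Returning False re-raises the exception instead of swallowing it.
        return False
```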

Deprecations

Deprecate old Trace Viewer as the new StudioClient replaces it. This affects Tracer.submit_to_trace_viewer.

Full Changelog: v5.0.3...v5.1.0

22 Jul 07:36

MerlinKallenbornAA

v5.0.3

11fdd41

v5.0.3

5.0.3

Fixes

Corrected docstrings for the calculate_bleu function.

Full Changelog: v5.0.2...v5.0.3

09 Jul 08:49

maxhammeralephalpha

v5.0.2

adde8c0

5.0.2

Fixes

Reverted a change that introduced a bug in MultipleChunkRetrieverQa text highlighting.

Full Changelog: v5.0.1...v5.0.2

01 Jul 09:34

maxhammeralephalpha

v5.0.1

64af939

5.0.1

Fixes

Serialization and deserialization of ExportedSpan and its attributes now works as expected.
PromptTemplate.to_rich_prompt now always returns an empty list for prompt ranges that are empty.
SingleChunkQa no longer crashes if given an empty input and a specific prompt template. This did not affect users who used models provided in core.
Added default values for labels and metadata for EvaluationOverview and RunOverview
In the MultipleChunkRetrieverQa, text-highlight start and end points are now restricted to within the text length of the respective chunk.
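
The restriction of highlight ranges described in the last item can be pictured as a simple clamp to the chunk length. This is a minimal sketch, assuming highlights are `(start, end)` character offsets; the function name `clamp_highlights` and the decision to drop empty ranges are assumptions for illustration, not the SDK's code.

```python
from typing import List, Tuple


def clamp_highlights(chunk: str, highlights: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Restrict (start, end) ranges to [0, len(chunk)] and drop ranges that end up empty."""
    limit = len(chunk)
    clamped = []
    for start, end in highlights:
        start = max(0, min(start, limit))
        end = max(0, min(end, limit))
        if start < end:
            clamped.append((start, end))
    return clamped
```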

Full Changelog: v5.0.0...v5.0.1

25 Jun 10:12

MerlinKallenbornAA

v5.0.0

2ad73f4

v5.0.0

5.0.0

Breaking Changes

RunRepository.example_output now returns None and prints a warning when there is no associated record for the given run_id instead of raising a ValueError.
RunRepository.example_outputs now returns an empty list and prints a warning when there is no associated record for the given run_id instead of raising a ValueError.
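
For callers, the practical consequence is that a missing run must be handled by checking the return value rather than catching `ValueError`. The sketch below mimics the described behavior with a hypothetical in-memory class (`SketchRunRepository`); it uses Python's `warnings` module as a stand-in for however the SDK emits its warning, and the method signatures are assumptions.

```python
import warnings
from typing import Dict, List, Optional


class SketchRunRepository:
    """Hypothetical stand-in; NOT the SDK's RunRepository."""

    def __init__(self) -> None:
        self._outputs: Dict[str, Dict[str, str]] = {}

    def store(self, run_id: str, example_id: str, output: str) -> None:
        self._outputs.setdefault(run_id, {})[example_id] = output

    def example_output(self, run_id: str, example_id: str) -> Optional[str]:
        run = self._outputs.get(run_id)
        if run is None:
            # Previously this situation would have raised a ValueError.
            warnings.warn(f"No record found for run_id {run_id!r}")
            return None
        return run.get(example_id)

    def example_outputs(self, run_id: str) -> List[str]:
        run = self._outputs.get(run_id)
        if run is None:
            warnings.warn(f"No record found for run_id {run_id!r}")
            return []
        return list(run.values())
```

Calling code would then test for `None` or an empty list before using the result.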

Features

Runner.run_dataset can now be resumed after failure by setting the resume_from_recovery_data flag to True and calling Runner.run_dataset again.
For InMemoryRunRepository based Runners this is limited to runs that failed with an exception that did not crash the whole process/kernel.
For FileRunRepository based Runners even runs that crashed the whole process can be resumed.
DatasetRepository.examples now accepts an optional parameter examples_to_skip to enable skipping of Examples with the provided IDs.
Add how_to_resume_a_run_after_a_crash notebook.
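
The resume mechanism above can be thought of as skipping every example that already has a stored output. The following is a minimal sketch of that idea, not the SDK's implementation: `sketch_examples` and `sketch_run_dataset` are hypothetical helpers, and the plain dictionary stands in for a run repository holding recovery data.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Set, Tuple

Example = Tuple[str, str]  # (example_id, input); a simplification for this sketch


def sketch_examples(
    all_examples: Iterable[Example], examples_to_skip: Optional[Set[str]] = None
) -> Iterator[Example]:
    """Yield only the examples whose ID is not in examples_to_skip."""
    skip = examples_to_skip or set()
    return (example for example in all_examples if example[0] not in skip)


def sketch_run_dataset(
    task: Callable[[str], str],
    dataset: Iterable[Example],
    outputs: Dict[str, str],
    resume_from_recovery_data: bool = False,
) -> Dict[str, str]:
    """Run the task on every example; on resume, skip IDs that already have an output."""
    if not resume_from_recovery_data:
        outputs.clear()
    already_done = set(outputs)
    for example_id, value in sketch_examples(dataset, already_done):
        # Each output is stored right away, so a later failure keeps earlier results.
        outputs[example_id] = task(value)
    return outputs
```

Because outputs are stored one by one, a second call with `resume_from_recovery_data=True` only processes the examples that are still missing.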

Fixes

Remove unnecessary dependencies from IL
Added default values for labels and metadata for PartialEvaluationOverview

Full Changelog: v4.1.0...v5.0.0

17 Jun 12:13

MerlinKallenbornAA

v4.1.0

57e7a54

v4.1.0

4.1.0

New Features

Add eot_token property to ControlModel and derived classes (LuminousControlModel, Llama2InstructModel and Llama3InstructModel) and let PromptBasedClassify use this property instead of a hardcoded string.
Introduce a new argilla client ArgillaWrapperClient. This uses the argilla package as a connection to argilla and supports all question types that argilla supports in their FeedbackDataset. This includes text and yes/no questions. For more information about the questions, check their official documentation.
- Changes to switch:
  - DefaultArgillaClient -> ArgillaWrapperClient
  - Question -> argilla.RatingQuestion; options -> values, which only accepts a list
  - Field -> argilla.TextField
Add description parameter to Aggregator.aggregate_evaluation to allow individual descriptions without the need to create a new Aggregator. This was missing from the previous release.
Add optional field metadata to Dataset, RunOverview, EvaluationOverview and AggregationOverview
- Update parameter_optimization.ipynb to demonstrate usage of metadata
Add optional field label to Dataset, RunOverview, EvaluationOverview and AggregationOverview
Add unwrap_metadata flag to aggregation_overviews_to_pandas to enable inclusion of metadata in pandas export. Defaults to True.

Fixes

Reinitializing different AlephAlphaModel instances and retrieving their tokenizer should now consume a lot less memory.
Evaluations now raise errors if ids of examples and outputs no longer match. If this happens, continuing the evaluation would only produce incorrect results.
Performing evaluations on runs with a different number of outputs now raises errors. Continuing the evaluation in this case would only lead to an inconsistent state.
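
The two consistency checks described above amount to validating the pairing of examples and outputs before evaluating. A minimal sketch, assuming examples and outputs are each identified by a string ID in matching order; the function name `check_ids_match` and the error messages are hypothetical and not taken from the SDK.

```python
from typing import Sequence


def check_ids_match(example_ids: Sequence[str], output_ids: Sequence[str]) -> None:
    """Raise ValueError if the counts differ or any paired IDs disagree."""
    if len(example_ids) != len(output_ids):
        raise ValueError(
            f"Expected {len(example_ids)} outputs but got {len(output_ids)}."
        )
    mismatched = [(e, o) for e, o in zip(example_ids, output_ids) if e != o]
    if mismatched:
        raise ValueError(f"Example and output ids do not match: {mismatched}")
```

Failing early like this avoids computing metrics on misaligned pairs.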

Full Changelog: v4.0.1...v4.1.0

11 Jun 09:32

FelixFehse

v4.0.1

5c82bd4

v4.0.1

Breaking Changes

Remove the Trace class, as it was no longer used.
Renamed example_trace to example_tracer and changed return type to Optional[Tracer].
Renamed example_tracer to create_tracer_for_example.
Replaced langdetect with lingua as the language detection tool. This means that old thresholds for detection might need to be adapted.

New Features

Lineages now contain Tracer for individual Outputs.
convert_to_pandas_data_frame now also creates a column containing the Tracers.
run_dataset now has a flag trace_examples_individually to create Tracers for each example. Defaults to True.
Added optional metadata field to Example.

Fixes

ControlModels now emit a warning instead of raising an error when a non-recommended model is selected.
The LimitedConcurrencyClient.max_concurrency is now capped at 10, which is its default, as the underlying aleph_alpha_client currently does not support more.
ExpandChunk now works properly if the chunk of interest is not at the beginning of a very large document. As a consequence, MultipleChunkRetrieverQa now works better with larger documents and should return fewer None answers.

Full Changelog: v3.0.0...v4.0.1

04 Jun 12:55

SebastianNiehusAA

v3.0.0

3d9f453

v3.0.0

3.0.0

Breaking Changes

We removed the trace_id as a concept from various tracing-related functions and moved it into a context. If you did not directly use the trace_id, there is nothing to change.
- Task.run no longer takes a trace id. This was a largely unused feature, and we revamped the trace ids for the traces.
- Creating Span, TaskSpan or logs no longer takes trace_id. This is handled by the spans themselves, which now have a context that identifies them.
  - Span.id is therefore also removed. The id can be accessed via span.context.trace_id, but has a different type.
- The OpenTelemetryTracer no longer logs a custom trace_id into the attributes. Use the existing ids from its context instead.
- Accessing a single trace from a PersistentTracer.trace() is no longer supported, as the user does not have access to the trace_id anyway. The function is now called traces and returns all available traces for a tracer.
InMemoryTracer and derivatives are no longer pydantic.BaseModel. Use the export_for_viewing function to export a serializable representation of the trace.
We updated the graders to support Python 3.12 and moved away from the nltk package:
- BleuGrader now uses the sacrebleu package.
- RougeGrader now uses the rouge_score package.
When using the ArgillaEvaluator, attempting to submit to a dataset that already exists will no longer append to the dataset. This makes it more in line with other evaluation concepts.
- Instead of appending to an active argilla dataset, you now need to create a new dataset, retrieve it and then finally combine both datasets in the aggregation step.
- The ArgillaClient now has methods create_dataset for less fault-ignoring dataset creation and add_records for performant uploads.

New Features

Add support for Python 3.12
Add skip_example_on_any_failure flag to evaluate_runs (defaults to True). This lets you configure whether to keep an example for evaluation even if it failed for some run.
Add how_to_implement_incremental_evaluation.
Add export_for_viewing to tracers to be able to export traces in a unified format similar to OpenTelemetry.
- This is not supported for the OpenTelemetryTracer because of technical incompatibilities.
All exported spans now contain the status of the span.
Add description parameter to Evaluator.evaluate_runs and Runner.run_dataset to allow individual descriptions without the need to create a new Evaluator or Runner.
All models raise an error during initialization if an incompatible name is passed, instead of only when they are used.
Add aggregation_overviews_to_pandas function to allow for easier comparison of multiple aggregation overviews.
Add parameter_optimization.ipynb notebook to demonstrate the optimization of tasks by comparing different parameter combinations.
Add convert_file_for_viewing in the FileTracer to convert the trace file format to the new (OpenTelemetry style) format and save as a new file.
All tracers can now call submit_to_trace_viewer to send the trace to the Trace Viewer.

Fixes

The document index client now correctly URL-encodes document names in its queries.
The ArgillaEvaluator now properly supports dataset_name.
Update outdated how_to_human_evaluation_via_argilla.ipynb.
Fix bug in FileSystemBasedRepository causing spurious mkdir failure if the file actually exists.
Update broken README links to Read The Docs.
Fix a broken multi-label classify example in the evaluation tutorial.
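
To show why URL-encoding document names matters (the first fix in this list), the sketch below builds a request path with `urllib.parse.quote`. The URL layout and the function name `document_url` are assumptions made for this example and do not describe the real Document Index API; only the encoding step is the point.

```python
from urllib.parse import quote


def document_url(base_url: str, namespace: str, collection: str, document_name: str) -> str:
    """Build a URL for a document; the path layout is hypothetical."""
    # safe="" also encodes "/", so a name containing slashes stays one path segment.
    encoded_name = quote(document_name, safe="")
    return f"{base_url}/collections/{namespace}/{collection}/docs/{encoded_name}"
```

Without the encoding, a name such as `reports/Q1 summary.txt` would be split into two path segments and contain a raw space.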

Full Changelog: v2.0.0...v3.0.0

21 May 09:21

MerlinKallenbornAA

v2.0.0

a5ab8e5

v2.0.0

2.0.0

Breaking Changes

Changed the behavior of IncrementalEvaluator::do_evaluate such that it now sends all SuccessfulExampleOutputs to do_incremental_evaluate instead of only the new SuccessfulExampleOutputs.

New Features

Add generic EloEvaluationLogic class for implementation of Elo evaluation use cases.
Add EloQaEvaluationLogic for Elo evaluation of QA runs, with optional later addition of more runs to an existing evaluation.
Add EloAggregationAdapter class to simplify using the ComparisonEvaluationAggregationLogic for different Elo use cases.
Add elo_qa_eval tutorial notebook describing the use of an (incremental) Elo evaluation use case for QA models.
Add how_to_implement_elo_evaluations how-to as skeleton for implementing Elo evaluation cases
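
For readers unfamiliar with Elo, the textbook rating update that such evaluations build on can be sketched as follows. This is the standard Elo formula, not the SDK's `EloEvaluationLogic`; the function names and the K-factor of 20 are assumptions chosen for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 20.0
) -> Tuple[float, float]:
    """Update both ratings; score_a is 1.0 for a win of A, 0.5 for a draw, 0.0 for a loss."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

In a QA comparison, each pairwise judgement between two runs would play the role of one match.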

Fixes

The ExpandChunks task is now fast even for very large documents.

Full Changelog: v1.2.0...v2.0.0
