Improve TLM documentation #216

Merged (13 commits) on Apr 16, 2024
37 changes: 24 additions & 13 deletions cleanlab_studio/studio/studio.py
@@ -391,28 +391,39 @@ def TLM(
 timeout: Optional[float] = None,
 verbose: Optional[bool] = None,
 ) -> trustworthy_language_model.TLM:
-    """Gets a configured instance of Trustworthy Language Model (TLM).
+    """Instantiates a configured Trustworthy Language Model (TLM).
 
-    The returned TLM object can then be used as a drop-in replacement for an LLM, for estimating trustworthiness scores for LLM prompt/response pairs, and more. See the documentation for the [TLM](../trustworthy_language_model#class-TLM) class for more on what you can do with TLM.
+    The TLM object can be used as a drop-in replacement for an LLM, or for estimating trustworthiness scores for arbitrary text prompt/response pairs, and more (see the [TLM documentation](../trustworthy_language_model#class-TLM)).

-    For advanced use cases, TLM supports a number of configuration options. The documentation below summarizes the options, and the [TLM tutorial](/tutorials/tlm) explains the tradeoffs in more detail.
+    For advanced use, TLM offers configuration options. The documentation below summarizes these options, and more details are explained in the [TLM tutorial](/tutorials/tlm).
 
     Args:
-        quality_preset (TLMQualityPreset): quality preset to use for TLM queries, which will determine the quality of the output responses and trustworthiness scores.
-            Supported presets include "best", "high", "medium", "low", "base".
-            The "best" and "high" presets will improve the LLM responses themselves, with "best" also returning the most reliable trustworthiness scores.
-            The "medium" and "low" presets will return standard LLM responses along with associated confidence scores,
-            with "medium" producing more reliable trustworthiness scores than low.
-            The "base" preset will not return any confidence score, just a standard LLM output response, this option is similar to using your favorite LLM API.
-            Higher presets have increased runtime and cost.
+        quality_preset (TLMQualityPreset): an optional preset to control the quality of TLM responses and trustworthiness scores vs. runtimes/costs.
+            TLMQualityPreset is a string specifying one of the supported presets: "best", "high", "medium", "low", "base".
+
+            The "best" and "high" presets improve the LLM responses themselves,
+            with "best" returning more reliable trustworthiness scores than "high".
+            The "medium" and "low" presets return standard LLM responses along with associated trustworthiness scores,
with "medium" producing more reliable trustworthiness scores than low.
The "base" preset will not return any trustworthiness score, just a standard LLM response, and is similar to directly using your favorite LLM API.

Higher presets have increased runtime and cost (and may internally consume more tokens).
Reduce your preset if you see token-limit errors.
Details about each present are in the documentation for [TLMOptions](../trustworthy_language_model#class-tlmoptions).
Avoid using "best" or "high" presets if you primarily want to get trustworthiness scores, and are less concerned with improving LLM responses.
These presets have higher runtime/cost and are optimized to return more accurate LLM outputs, but not necessarily more reliable trustworthiness scores.

options (TLMOptions, optional): a typed dict of advanced configuration options.
Options that can be passed in include "model", "max_tokens", "num_candidate_responses", "num_consistency_samples", "use_self_reflection".
Avaialable options (keys in this dict) include "model", "max_tokens", "num_candidate_responses", "num_consistency_samples", "use_self_reflection".
For more details about the options, see the documentation for [TLMOptions](../trustworthy_language_model#class-tlmoptions).
If specified, these override any settings from the choice of `quality_preset`.

timeout (float, optional): timeout (in seconds) to apply to each method call. If a result is not produced within the timeout, a TimeoutError will be raised. Defaults to None, which does not apply a timeout.
timeout (float, optional): timeout (in seconds) to apply to each TLM prompt.
If a batch of data is passed in, the timeout will be applied to each individual item in the batch.
If a result is not produced within the timeout, a TimeoutError will be raised. Defaults to None, which does not apply a timeout.

verbose (bool, optional): whether to run in verbose mode, i.e., whether to show a tqdm progress bar when TLM is prompted with batches of data. If None, this will be determined automatically based on whether the code is running in an interactive environment such as a notebook.
verbose (bool, optional): whether to print outputs during execution, i.e., whether to show a progress bar when TLM is prompted with batches of data.
If None, this will be determined automatically based on whether the code is running in an interactive environment such as a Jupyter notebook.

Returns:
TLM: the [Trustworthy Language Model](../trustworthy_language_model#class-tlm) object
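To make the documented arguments concrete, here is a minimal usage sketch of the `studio.TLM()` entry point this diff documents. The `Studio` class name matches this repo, but the API key, the `options` values (including the `gpt-4` model name), and the `response`/`trustworthiness_score` result keys are illustrative assumptions drawn from the linked TLM class docs, not from this diff itself:

```python
from cleanlab_studio import Studio

# Placeholder API key for illustration only.
studio = Studio("<YOUR_API_KEY>")

# Instantiate TLM using the arguments documented in this diff.
# Keys in `options` override settings implied by `quality_preset`.
tlm = studio.TLM(
    quality_preset="medium",  # one of: "best", "high", "medium", "low", "base"
    options={"model": "gpt-4", "max_tokens": 256},  # assumed example values
    timeout=60,    # seconds per prompt; None (the default) applies no timeout
    verbose=True,  # show a progress bar when prompting with batches
)

# Drop-in LLM usage: the result is assumed to contain the LLM response
# along with its trustworthiness score.
output = tlm.prompt("What is the capital of France?")
print(output["response"], output["trustworthiness_score"])
```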
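The new `timeout` and `verbose` docs above describe per-item behavior for batches, which implies `prompt()` also accepts a list of prompts. A sketch of that batched flow, continuing from the `tlm` object above (the list-input signature is inferred from the docstring, so treat it as an assumption):

```python
# Batched prompting: `timeout` applies to each item individually, and
# `verbose=True` shows a progress bar over the batch.
prompts = [
    "What is the capital of France?",
    "Summarize the plot of Hamlet in one sentence.",
]
outputs = tlm.prompt(prompts)  # assumed to return one result per prompt, in order
for out in outputs:
    print(out["response"], out["trustworthiness_score"])
```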
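Finally, the docstring mentions estimating trustworthiness scores for arbitrary prompt/response pairs. A sketch of that flow, continuing from the same `tlm` object and assuming the `get_trustworthiness_score` method from the linked TLM class docs (exact return type may vary by version):

```python
# Score a response produced elsewhere (e.g., by a different LLM).
# Note: the "base" preset returns no trustworthiness score, so use any
# other preset when scoring.
prompt = "What is the capital of France?"
response = "The capital of France is Paris."
score = tlm.get_trustworthiness_score(prompt, response)
print(score)  # higher values indicate a more trustworthy response
```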