Skip to content

Commit

Permalink
feat: 985 feature argillalabeller task (#986)
Browse files Browse the repository at this point in the history
* feat: add initial version of argilla labeller task

* fix: arguments in runtime parameters

* feat: add field descriptions

* feat: Update record formatting logic during structured generation

* feat: update workflows

* refactor: work based off server payloads

* fix: resolve serializatione xample records

* fix: only convert examples w when provided

* fix: set to basically zero

* fix: add temperature fix

* fix: revert changes

* fix: example records with formatted responses

* fix: set max new tokens manually

* fix: some fixes in formatting

* refactor: some code quality improvements

* feat: improv

* refactor: remove unused code

* fix: wrong prompt template

* fix: remove print statement

* fix: added pydantic rtuntimeparameter definition

* fix: creating new characters per line examples

* fix: add nuance on example in prompt template

* feat: Add guidelines to prompt template

* fix: remove pdb trace

* fix: avoid using records without correct responses

* feat: add ability to forward different questions

* test: add tests for argilla labeller

* fix: wrong docstring

* fix: wrong docstring

* refactor: rename suggestions -> suggestion

* docs: update examples

* tests: remove span question

* docs: update the examples

* Apply suggestions from code review

Co-authored-by: Gabriel Martín Blázquez <[email protected]>

* refactor: apply suggestions code review

* fix: type hinting Record import

* fix: tests

* tests: fix failing tests

---------

Co-authored-by: Gabriel Martín Blázquez <[email protected]>
  • Loading branch information
davidberenstein1957 and gabrielmbmb authored Oct 3, 2024
1 parent 3fd680c commit a46489e
Show file tree
Hide file tree
Showing 6 changed files with 815 additions and 2 deletions.
2 changes: 1 addition & 1 deletion src/distilabel/llms/openai.py
Original file line number Diff line number Diff line change
Expand Up @@ -667,7 +667,7 @@ def _create_jsonl_row(
"""Creates a JSONL formatted row to be used by the OpenAI Batch API.
Args:
inputs: a list of inputs in chat format to generate responses for, optionally
input: a list of inputs in chat format to generate responses for, optionally
including structured output.
custom_id: a custom ID to use for the row.
kwargs: the keyword arguments to use for the generation.
Expand Down
1 change: 0 additions & 1 deletion src/distilabel/steps/clustering/text_clustering.py
Original file line number Diff line number Diff line change
Expand Up @@ -223,7 +223,6 @@ def _create_figure(
inputs: The inputs of the step, as we will extract information from them again.
label2docs: Map from each label to the list of documents (texts) that belong to that cluster.
cluster_summaries: The summaries of the clusters, obtained from the LLM.
labels: The labels of the clusters (integers representing each predicted class).
"""
self._logger.info("🖼️ Creating figure for the clusters...")

Expand Down
2 changes: 2 additions & 0 deletions src/distilabel/steps/tasks/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

from distilabel.steps.tasks.argilla_labeller import ArgillaLabeller
from distilabel.steps.tasks.base import GeneratorTask, Task
from distilabel.steps.tasks.complexity_scorer import ComplexityScorer
from distilabel.steps.tasks.evol_instruct.base import EvolInstruct
Expand Down Expand Up @@ -52,6 +53,7 @@
__all__ = [
"GeneratorTask",
"Task",
"ArgillaLabeller",
"ComplexityScorer",
"EvolInstruct",
"EvolComplexity",
Expand Down
Loading

0 comments on commit a46489e

Please sign in to comment.