
chore(weave): basic sdk for human feedback spec types #2801

Merged 13 commits from griffin/human-feedback-sdk into master, Nov 8, 2024

Conversation

@gtarpenning (Member) commented Oct 28, 2024

Description

WB-21670

Basic SDK types for human feedback. Use the objects API to create a feedback spec, which can then be used in the UI to generate human scorer columns.

Testing

Adds a test. Eventually we will want an integration test going from the type definition to loading the UI, but that is blocked on CI improvements and the frontend landing.

circle-job-mirror bot commented Oct 28, 2024

@gtarpenning gtarpenning marked this pull request as ready for review November 1, 2024 19:58
@gtarpenning gtarpenning requested a review from a team as a code owner November 1, 2024 19:58
from weave.trace_server.interface.base_object_classes import base_object_def


class HumanAnnotationColumn(base_object_def.BaseObject):
Collaborator:
Note: "HumanAnnotationColumn" is a permanent name. It is sort of like a table name, so we have to be sure we like it (: I think there is a subtle case where there are AnnotationColumns that are not filled out by humans... which might mean we drop the "Human" part?

@gtarpenning (Member, Author) commented Nov 7, 2024:

Yep, that sounds right. Do we then want a field within the class to denote that it came from a human / is for a human?

Collaborator:
There are currently creator and wb_user_id columns on the feedback table. I think using those is appropriate (and that should be automatic right now).


# If provided, this feedback type will only be shown
# when a call is generated from the given op
op_scope: Optional[list[str]] = None
Collaborator:
Is this expected to be a ref string or an op name?

Member Author:
I think we want ref strings, like the frontend format: "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:*"

Collaborator:
I see in your tests that this is just the name itself. You might consider using a ref here; then, if you wanted to match any version, you could use weave:///e/p/name:* (just like we support with op_name in the call filter).
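For illustration, a minimal sketch of the wildcard ref matching being discussed, using the ref format from this thread (the helper name and exact matching rule are assumptions, not the actual weave implementation):

```python
def ref_matches(op_ref: str, scope_ref: str) -> bool:
    """Return True if op_ref falls within scope_ref.

    A scope ref ending in ':*' matches any version of that op,
    mirroring the call-filter behavior described above.
    """
    if scope_ref.endswith(":*"):
        # Keep the trailing ':' so 'name:*' does not match 'name2:v1'.
        return op_ref.startswith(scope_ref[:-1])
    return op_ref == scope_ref

# Example refs in the frontend format shown above.
scope = "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:*"
call_op = "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:v3"
```

With these example refs, any version of Evaluation.summarize in that project would be in scope, while other ops would not.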

Collaborator:
Oh, you commented before I finished reviewing. Looks like we are on the same page.

Member Author:
Haha, yeah, okay, fixing.



class HumanAnnotationColumn(base_object_def.BaseObject):
json_schema: dict
Collaborator:
This is probably fine for now. You could add a field validator that uses https://python-jsonschema.readthedocs.io/en/latest/validate/ to check for schema errors.

)

@field_validator("json_schema")
def validate_json_schema(cls, v: dict) -> dict:
Collaborator:
Interesting. There are sort of two things here:

  1. If you want to explicitly type a subset of allowed JSON schemas (like you did here), just create a few BaseModels and make a union; no need for jsonschema.
  2. If you want to allow any JSON schema, do:

try:
    # Validating a dummy instance forces jsonschema to check the schema itself.
    jsonschema.validate(None, v)
except jsonschema.exceptions.SchemaError as e:
    raise ValueError(f"invalid JSON schema: {e.message}")
except jsonschema.exceptions.ValidationError:
    pass  # we don't care that `None` does not conform
return v
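Putting option 2 together with the @field_validator from the diff, a self-contained sketch (pared down to the json_schema field; using plain pydantic.BaseModel instead of base_object_def.BaseObject, and raising ValueError so Pydantic surfaces it, are both assumptions):

```python
import jsonschema
from pydantic import BaseModel, field_validator


class HumanAnnotationColumn(BaseModel):
    json_schema: dict

    @field_validator("json_schema")
    @classmethod
    def validate_json_schema(cls, v: dict) -> dict:
        try:
            # Validating a dummy instance forces jsonschema to first
            # check that `v` itself is a well-formed schema.
            jsonschema.validate(None, v)
        except jsonschema.exceptions.SchemaError as e:
            raise ValueError(f"invalid JSON schema: {e.message}") from e
        except jsonschema.exceptions.ValidationError:
            pass  # we don't care that `None` does not conform
        return v
```

A malformed schema (e.g. an unknown "type" value) then fails at model construction time rather than later in the UI.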

@@ -0,0 +1,58 @@
from typing import Optional

import jsonschema
Collaborator:
Did you need to add this package? I wonder if the trace server already has it transitively.

Member Author:
I don't understand what this means... I see that the trace server has it in the requirements; how do I access it "transitively"?

Collaborator:
Yeah, this is what I mean: in our pyproject.toml (inside core) there is no jsonschema. However, since litellm depends on jsonschema, you get it for free in the trace server (when deployed).

Collaborator:
But for these CI tests to pass, you need to add jsonschema to the dev requirements,

Collaborator:
inside of

test = [
  "nox",
  "pytest>=8.2.0",
  "pytest-asyncio>=0.23.6",
  "pytest-cov>=5.0.0",
  "pytest-xdist>=3.1.0",
  "pytest-rerunfailures>=12.0",
  "pytest-rerunfailures>=14.0",
  "clickhouse_connect==0.7.0",
  "fastapi>=0.110.0",
  "sqlparse==0.5.0",

  # Integration Tests
  "pytest-recording>=0.13.2",
  "vcrpy>=6.0.1",

  # serving tests
  "flask",
  "uvicorn>=0.27.0",
  "pillow",
  "filelock",
  "httpx",
]

Member Author:

Yeah, got it. Thanks.

@gtarpenning gtarpenning requested a review from tssweeney November 8, 2024 00:47
from weave.trace_server.interface.base_object_classes import base_object_def


class AnnotationColumn(base_object_def.BaseObject):
Member Author:

I keep going back and forth, but I think this should actually be AnnotationSpec. "Column" doesn't really make that much sense.

@gtarpenning gtarpenning merged commit dfda2d6 into master Nov 8, 2024
115 checks passed
@gtarpenning gtarpenning deleted the griffin/human-feedback-sdk branch November 8, 2024 01:50
@github-actions github-actions bot locked and limited conversation to collaborators Nov 8, 2024