
chore(weave): basic sdk for human feedback spec types #2801

Merged 13 commits from griffin/human-feedback-sdk into master, Nov 8, 2024

Conversation

@gtarpenning (Member) commented Oct 28, 2024

Description

WB-21670

Basic SDK types for human feedback. Use the objects API to create a feedback spec, which can then be used in the UI to generate human scorer columns.

Testing

Adds a test. Eventually we will want an integration test going from the type definition to loading the UI, but that is blocked on CI improvements and the frontend landing.

circle-job-mirror bot commented Oct 28, 2024

@gtarpenning gtarpenning marked this pull request as ready for review November 1, 2024 19:58
@gtarpenning gtarpenning requested a review from a team as a code owner November 1, 2024 19:58
from weave.trace_server.interface.base_object_classes import base_object_def


class HumanAnnotationColumn(base_object_def.BaseObject):
Collaborator:
Note: "HumanAnnotationColumn" is a permanent name. It is sort of like a table name, so we have to be sure we like it (: I think there is a subtle case where there are AnnotationColumns that are not filled out by humans... which might mean we drop the "Human" part?

@gtarpenning (Member, Author) commented Nov 7, 2024:

Yep, that sounds right. Do we then want a field within the class to denote that it came from a human / is for a human?

Collaborator:
There are currently creator and wb_user_id columns on the feedback table. I think using those is appropriate (and that should be automatic right now).


# If provided, this feedback type will only be shown
# when a call is generated from the given op
op_scope: Optional[list[str]] = None
Collaborator:
Is this expected to be a ref string or an op name?

Member Author:
I think we want ref strings, like the frontend format: "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:*"

Collaborator:
I see in your tests that this is just the name itself. You might consider using a ref here; then, if you wanted to match any version, you could use weave:///e/p/name:* (just like we support with op_name in the call filter).
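For illustration, a minimal sketch of the wildcard ref matching being discussed, using the ref format from this thread (the helper name and exact matching rule are assumptions, not the actual weave implementation):

```python
def ref_matches(op_ref: str, scope_ref: str) -> bool:
    """Return True if op_ref falls within scope_ref.

    A scope ref ending in ':*' matches any version of that op,
    mirroring the call-filter behavior described above.
    """
    if scope_ref.endswith(":*"):
        # Keep the trailing ':' so 'name:*' does not match 'name2:v1'.
        return op_ref.startswith(scope_ref[:-1])
    return op_ref == scope_ref

# Example refs in the frontend format shown above.
scope = "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:*"
call_op = "weave:///griffin_wb/prod-evals-aug/op/Evaluation.summarize:v3"
```

With these example refs, any version of Evaluation.summarize in that project would be in scope, while other ops would not.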

Collaborator:
Oh, you commented before I finished reviewing. Looks like we are on the same page.

Member Author:
Haha, yeah, okay, fixing.



class HumanAnnotationColumn(base_object_def.BaseObject):
json_schema: dict
Collaborator:
This is probably fine for now. You could add a field validator that uses https://python-jsonschema.readthedocs.io/en/latest/validate/ to check for schema errors.

)

@field_validator("json_schema")
def validate_json_schema(cls, v: dict) -> dict:
Collaborator:
Interesting. There are sort of two things here:

  1. If you want to explicitly type a subset of allowed JSON schemas (like you did here), just create a few BaseModels and make a union; no need for jsonschema.
  2. If you want to allow any JSON schema, do:

try:
    # Validating a dummy instance forces jsonschema to check the schema itself.
    jsonschema.validate(None, v)
except jsonschema.exceptions.SchemaError as e:
    raise ValueError(f"invalid JSON schema: {e.message}")
except jsonschema.exceptions.ValidationError:
    pass  # we don't care that `None` does not conform
return v
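Putting option 2 together with the @field_validator from the diff, a self-contained sketch (pared down to the json_schema field; using plain pydantic.BaseModel instead of base_object_def.BaseObject, and raising ValueError so Pydantic surfaces it, are both assumptions):

```python
import jsonschema
from pydantic import BaseModel, field_validator


class HumanAnnotationColumn(BaseModel):
    json_schema: dict

    @field_validator("json_schema")
    @classmethod
    def validate_json_schema(cls, v: dict) -> dict:
        try:
            # Validating a dummy instance forces jsonschema to first
            # check that `v` itself is a well-formed schema.
            jsonschema.validate(None, v)
        except jsonschema.exceptions.SchemaError as e:
            raise ValueError(f"invalid JSON schema: {e.message}") from e
        except jsonschema.exceptions.ValidationError:
            pass  # we don't care that `None` does not conform
        return v
```

A malformed schema (e.g. an unknown "type" value) then fails at model construction time rather than later in the UI.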

@@ -0,0 +1,58 @@
from typing import Optional

import jsonschema
Collaborator:
Did you need to add this package? I wonder if the trace server already has it transitively.

Member Author:
I don't understand what this means... I see that the trace server has it in the requirements; how do I access it "transitively"?

Collaborator:
Yeah, this is what I mean: in our pyproject.toml (inside core) there is no jsonschema. However, since litellm depends on jsonschema, you get it for free in the trace server (when deployed).

Collaborator:
But for these CI tests to pass, you need to add jsonschema to the dev requirements,

Collaborator:
inside of

test = [
  "nox",
  "pytest>=8.2.0",
  "pytest-asyncio>=0.23.6",
  "pytest-cov>=5.0.0",
  "pytest-xdist>=3.1.0",
  "pytest-rerunfailures>=12.0",
  "pytest-rerunfailures>=14.0",
  "clickhouse_connect==0.7.0",
  "fastapi>=0.110.0",
  "sqlparse==0.5.0",

  # Integration Tests
  "pytest-recording>=0.13.2",
  "vcrpy>=6.0.1",

  # serving tests
  "flask",
  "uvicorn>=0.27.0",
  "pillow",
  "filelock",
  "httpx",
]

Member Author:

Yeah, got it. Thanks.

@gtarpenning gtarpenning requested a review from tssweeney November 8, 2024 00:47
from weave.trace_server.interface.base_object_classes import base_object_def


class AnnotationColumn(base_object_def.BaseObject):
Member Author:

I keep going back and forth, but I think this should actually be AnnotationSpec. "Column" doesn't really make that much sense.

@gtarpenning gtarpenning merged commit dfda2d6 into master Nov 8, 2024
115 checks passed
@gtarpenning gtarpenning deleted the griffin/human-feedback-sdk branch November 8, 2024 01:50
@github-actions github-actions bot locked and limited conversation to collaborators Nov 8, 2024