
OpenAI: instrument embeddings and chat completions #2398

Merged: 77 commits into openai_instrumentation on Feb 14, 2024

Conversation

@kaylareopelle (Contributor) commented on Jan 18, 2024

Initial PR for OpenAI instrumentation. We're merging this into the openai_instrumentation branch so that our initial instrumentation rollout is preserved as a single historical PR, which should make it easier for others to reference in the future.

Co-authored-by: @hannahramadan
Closes #2403
Closes #2404

This PR includes:

  • Basic instrumentation setup for the ruby-openai gem
    • Support for versions 3.4.0+, based on the versions currently in use by our customers (internal usage dashboard)
    • Create an ai multiverse runner group
    • Add ai/ruby_openai testing to the appropriate CI workflows
  • Instrumentation for OpenAI::Client#embeddings (a rough sketch of the general hook pattern follows this list):
    • Record a segment on every method invocation
    • Record a metric on every method invocation
    • Create LlmEmbedding events
  • Instrumentation for OpenAI::Client#chat, known in the spec as chat completions:
    • Record a segment on every method invocation
    • Record a metric on every method invocation
    • Create LlmChatCompletionSummary events
    • Create LlmChatCompletionMessage events
  • Add attributes to LlmEmbedding and LlmChatCompletionSummary events from the Net::HTTP response headers
  • Set llm: true as an attribute on all transaction events with LLM-related segments
  • Test using mocked responses from real OpenAI requests
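
As referenced above, the hook pattern behind these bullets looks roughly like the following. This is a hedged sketch rather than the agent's actual implementation; the module and helper names are invented for illustration.

module OpenAIChatHookSketch
  def chat(parameters: {})
    response = super                           # call ruby-openai's real #chat
    record_llm_telemetry(parameters, response) # hypothetical: segment, metric, Llm* events
    response                                   # hand the caller back the untouched response
  end

  private

  def record_llm_telemetry(parameters, response)
    # Placeholder: in the real instrumentation, this is where the segment is
    # finished, the metric recorded, and LlmChatCompletionSummary/Message
    # events built from `parameters` and `response`.
  end
end

OpenAI::Client.prepend(OpenAIChatHookSketch) if defined?(OpenAI::Client)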

It also makes the following updates to the Llm-namespaced classes/modules:

  • Update the LlmEvent and other Llm classes to use the attribute key strings expected by the UI
  • Add an llm_event attribute to AbstractSegment
  • Organize attribute names to reflect the order of the spec

This PR does not contain the following GA-required tasks:

  • Changelog entry
  • Tests for the Net::HTTP response header assignment to the LLM events
  • Tests that validate the OpenAI instrumentation correctly assigns LLM event attribute values
  • Error tracing attributes
  • Feedback API
  • The overall ai_monitoring.enabled configuration option

These items will be opened in separate PRs that we'll merge into the openai_instrumentation feature branch before GA.

@kaylareopelle kaylareopelle self-assigned this Jan 22, 2024
@kaylareopelle kaylareopelle changed the title OpenAI instrumentation scaffold OpenAI instrumentation Jan 24, 2024
kaylareopelle and others added 20 commits January 29, 2024 09:30
Some attribute names are expected upstream to include periods, but we can't use periods in setter/getter method names. To resolve this, a new ATTRIBUTE_NAME_EXCEPTIONS constant, accompanied by an attribute_name_exceptions method, stores the string value of each such attribute name.

Then, replace_attr_with_string returns the correct string name when we're creating the final event_attributes hash.
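
For illustration, a minimal sketch of that pattern follows; the mapping entries are assumed examples rather than the constant's actual contents.

# Sketch only; these mappings are examples, not the real ATTRIBUTE_NAME_EXCEPTIONS.
ATTRIBUTE_NAME_EXCEPTIONS = {
  response_model: 'response.model',         # stored via attr_accessor :response_model
  request_max_tokens: 'request.max_tokens'  # stored via attr_accessor :request_max_tokens
}.freeze

# Used while building the final event_attributes hash: methods keep
# period-free names, the emitted keys use the UI's dotted strings.
def replace_attr_with_string(attr)
  ATTRIBUTE_NAME_EXCEPTIONS.fetch(attr, attr.to_s)
end

replace_attr_with_string(:response_model) # => "response.model"
replace_attr_with_string(:vendor)         # => "vendor"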
Embedding and ChatCompletionSummary both need guids for their ids. ChatCompletionMessage does not. Let's set the guid as the default, and allow ChatCompletionMessage to override the guid.
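
A minimal sketch of that default-and-override arrangement, with simplified class shapes and SecureRandom standing in for however the agent actually generates guids:

require 'securerandom'

class LlmEventSketch
  attr_reader :id

  def initialize(id: nil)
    @id = id || SecureRandom.uuid # Embedding and ChatCompletionSummary keep this generated guid
  end
end

class ChatCompletionMessageSketch < LlmEventSketch
  def initialize(id: nil)
    super
    @id = id # override: a message uses the id it was given, no generated guid
  end
end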
Our instrumentation was changing the response when called in the begin block.
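
In other words, the hook should hand the caller back the original response untouched; a sketch of the general shape inside the prepended module (helper name hypothetical):

  def embeddings(parameters: {})
    response = super
    process_response_for_telemetry(response.dup) # hypothetical helper; work on a copy
    response                                     # return the original, unmodified response
  end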

executes do
if use_prepend?
if OPENAI_VERSION >= Gem::Version.new('5.0.0')
Contributor:
Cool to see the support for older versions.
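
A hedged sketch of what a gate like that typically selects between; the module names here are placeholders rather than the agent's actual prepend targets.

module OpenAIHookCurrent; end # stand-in for the hook used with ruby-openai 5.0.0+
module OpenAIHookLegacy; end  # stand-in for the hook used with 3.4.0 through 4.x

spec = Gem.loaded_specs['ruby-openai']
openai_version = spec ? spec.version : Gem::Version.new('0')

hook =
  if openai_version >= Gem::Version.new('5.0.0')
    OpenAIHookCurrent
  else
    OpenAIHookLegacy
  end

OpenAI::Client.prepend(hook) if defined?(OpenAI::Client)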

vendor: VENDOR,
conversation_id: conversation_id,
api_key_last_four_digits: parse_api_key,
request_max_tokens: parameters[:max_tokens] || parameters['max_tokens'],
Contributor:
What leads the hash to have symbols or strings for keys, and are there ever both types of key present in the same hash? It might be handy to have a helper method like this:

request_max_tokens: parameter_value(parameters, :max_tokens)

...

def parameter_value(parameters, value)
  parameters[value] || parameters[value.to_s]
end

and if we can rely on the hash keys being all symbols or all strings, we could further enhance the helper to memoize which key type is involved.
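
A sketch of that memoized variant, assuming (incorrectly, per the reply below) that every key in a given parameters hash shares one type:

def parameter_value(parameters, key)
  @parameters_use_symbol_keys = parameters.each_key.first.is_a?(Symbol) if @parameters_use_symbol_keys.nil?
  @parameters_use_symbol_keys ? parameters[key] : parameters[key.to_s]
end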

Contributor:
Both Strings and Symbols are accepted as keys, so it's up to the user which they use. And yes, both types can be mixed and matched in the same request.

Side note: We did performance testing of || vs kind_of? vs is_a? and found || to be the fastest by about 1.2x, which is why we went with an "or" check rather than creating a helper method that would decide which key type to use.
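
Not the benchmark the team ran, but roughly how the comparison could be reproduced with benchmark-ips:

require 'benchmark/ips'

params = { 'max_tokens' => 100 } # string keys force the fallback path

Benchmark.ips do |x|
  x.report('|| fallback')  { params[:max_tokens] || params['max_tokens'] }
  x.report('is_a? branch') { params.keys.first.is_a?(Symbol) ? params[:max_tokens] : params['max_tokens'] }
  x.compare!
end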

end

def parse_api_key
'sk-' + headers['Authorization'][-4..-1]
Contributor:
Hmm... it looks like the ability to use [-4..] instead of [-4..-1] wasn't introduced until Ruby v2.6.
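
For reference, the two spellings side by side; the endless-range form needs Ruby 2.6+:

token = 'Bearer sk-abc123wxyz' # made-up example value
'sk-' + token[-4..-1] # => "sk-wxyz", works on all supported Rubies
'sk-' + token[-4..]   # => "sk-wxyz", endless range, Ruby 2.6+ only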

def attribute_name_exceptions
# TODO: OLD RUBIES < 2.6
# Hash#merge accepts multiple arguments in 2.6, so we can reduce this
# to a single Hash#merge call with two arguments at that point
Contributor:
This is where better granular perf tests would help. If there's a significant performance gain to be had, it'd be worth writing the code both ways:

if RUBY_VERSION >= '2.6'
  new_way
else
  old_way
end

Contributor:
Great callout! Ran a perf test a few times and item.merge(item1).merge(item2) is consistently ~1.25x slower than a single multi-argument merge. We will break this out into a conditional.

Contributor:
Done! I wanted to use a ternary operator, but it got super long: 36d6cab
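
A sketch of the shape that conditional might take; the hash names and values below are placeholders rather than the agent's actual attribute groups:

base_attributes     = { 'vendor' => 'openai' }
request_attributes  = { 'request.max_tokens' => 100 }
response_attributes = { 'response.model' => 'gpt-4' }

merged =
  if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('2.6')
    base_attributes.merge(request_attributes, response_attributes)       # multi-argument merge, 2.6+
  else
    base_attributes.merge(request_attributes).merge(response_attributes) # chained merges for older Rubies
  end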

sequence: index,
completion_id: summary_id,
vendor: VENDOR,
is_response: false
Contributor:
According to the spec, this shouldn't be set to false; the attribute should be omitted entirely when it would be false:

set to True if a message is the result of a chat completion and not an input message - omitted in False cases
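
One way to honor that, sketched with invented method and argument names:

def message_attributes(index:, summary_id:, from_response:)
  attrs = { sequence: index, completion_id: summary_id, vendor: 'placeholder-vendor' }
  attrs[:is_response] = true if from_response # key omitted entirely in the false case
  attrs
end

message_attributes(index: 0, summary_id: 'abc', from_response: false) # no :is_response key at all
message_attributes(index: 1, summary_id: 'abc', from_response: true)  # includes is_response: true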

Contributor Author:
Great callout! Thank you!

Contributor:
This has been updated! cdd42f6

@hannahramadan hannahramadan changed the base branch from dev to openai_instrumentation February 13, 2024 22:34
CHANGELOG.md: outdated review thread, resolved
@kaylareopelle kaylareopelle changed the title OpenAI instrumentation OpenAI: instrument embeddings and chat completions Feb 14, 2024
@kaylareopelle kaylareopelle merged commit faf9ab9 into openai_instrumentation Feb 14, 2024
27 of 28 checks passed
@hannahramadan hannahramadan deleted the openai branch April 12, 2024 19:28

Successfully merging this pull request may close these issues:

  • OpenAI: Instrument chat completions
  • OpenAI: Instrument embeddings
4 participants