Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement custom bedrock json parser for v1 & v2 #25

Closed
wants to merge 14 commits into from

Conversation

yiyuan-he
Copy link

@yiyuan-he yiyuan-he commented Dec 11, 2024

Description of changes:

Implement custom bedrock json parser for our Java v1 & v2 SDK auto-instrumentation using only native language features to avoid introducing external dependencies.

TODO:

  • Address PR comments from @mxiamxia Pending further review.
  • Implementation parity between v2 and v1 Done.
  • Add unit tests for custom parser Tentatively done. May need to add more test cases
  • Update behavior to not add attribute when value is null. Currently it assigns "null" as attribute value.
  • Add support for new Amazon Nova Models. Done for both v1 & v2.
  • Implement custom parser for Java v1. Updated this PR with v1 changes.
  • Investigate why only Mistral contract test case is failing. There is minor rounding difference when using custom parser. Had to update the assertion for gen_ai.usage.input_tokens from 15 to 16.

Test plan:

  • Added unit tests for various edge cases.

  • Ran the existing contract tests in our ADOT package and verified the test cases for all 6 models are passing.

Screenshot 2024-12-11 at 2 33 07 PM

  • Verified generated spans for 6 models that we currently support. Also added some logging statements to measure the processing time of the custom parser.

Amazon Nova
amazon nova

Amazon Titan
amazon titan

Anthropic Claude
anthropic claude

AI21 Jamba
ai21 jamba

Mistral
mistral

Meta Llama
meta llama

Cohere Command R
cohere command r

@yiyuan-he yiyuan-he changed the title set up custom bedrock json parser Set up custom bedrock json parser for Java v2 Dec 11, 2024
@yiyuan-he yiyuan-he changed the title Set up custom bedrock json parser for Java v2 POC: Set up custom bedrock json parser for Java v2 Dec 11, 2024
@yiyuan-he yiyuan-he changed the title POC: Set up custom bedrock json parser for Java v2 PoC: Set up custom bedrock json parser for Java v2 Dec 11, 2024
@yiyuan-he yiyuan-he changed the title PoC: Set up custom bedrock json parser for Java v2 PoC: Set up custom bedrock json parser for v1 & v2 Dec 11, 2024
@yiyuan-he yiyuan-he changed the base branch from v1_32_1_dev to v1_33_6_patch December 11, 2024 22:55
@yiyuan-he yiyuan-he changed the base branch from v1_33_6_patch to v1_32_1_dev December 11, 2024 22:56
Comment on lines +24 to +27
public JsonParser(String json) {
this.json = json.trim();
this.position = 0;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid to create new JsonParser instances with JsonString text from constructor. Can we update public Map<String, Object> parse() to static and have it take the JsongString text as input so we can call JsonParser.parse(json) directly in the caller method?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I found JsonParser needs to maintain position state in each instance for each data processing. is it necessary?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, suggest changing JsonParser to something like JSONObject

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JsonParser needs to maintain some kind of position state since we are parsing the string character by character.

However, this state does not need to be exposed as public. I think we can hide this state from the user by making it private instead.

Comment on lines 62 to 63
BedrockJsonParser.JsonParser parser = new BedrockJsonParser.JsonParser(jsonString);
Map<String, Object> jsonBody = parser.parse();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BedrockJsonParser.JsonObject jsonObject = new BedrockJsonParser.JsonObject(jsonString);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will implement something similar with this:

// Parse the LLM JSON string into a Map
LlmJson llmJson = BedrockJsonParser.parse(jsonString);

return null;
}

private static Object resolvePath(Map<String, Object> json, String path) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest changing ot the custom JsonObject as input instead of generic Map which will restrict to pass in the correct/expect data while calling this method

return result.toString();
}

private Object readValue() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an open-source mini-json parser without requiring Dep, which has very similar Impl like yours which is good. :) Could double check test a few things? Will you impl be able to handle the attribute value with special chars such as \n?

https://github.com/ralfstx/minimal-json/blob/master/com.eclipsesource.json/src/main/java/com/eclipsesource/json/JsonParser.java#L159

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the unit tests for the json utils? We test against different bedrock response format pattern and edge cases

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added unit tests to cover special chars and unicode that may appear in the input/output prompts. Also added some test cases to cover empty objects and arrays.

@yiyuan-he yiyuan-he changed the title PoC: Set up custom bedrock json parser for v1 & v2 Implement custom bedrock json parser for v1 & v2 Dec 13, 2024
Comment on lines 47 to 81
try {
JsonNode jsonBody;
// Extract JSON string from target if it is a Bedrock Runtime JSON blob
String jsonString;
if (target instanceof SdkBytes) {
String jsonString = ((SdkBytes) target).asUtf8String();
jsonBody = objectMapper.readTree(jsonString);
jsonString = ((SdkBytes) target).asUtf8String();
} else {
if (target != null) {
return target.toString();
}
return null;
}

// Parse the LLM JSON string into a Map
LlmJson llmJson = BedrockJsonParser.parse(jsonString);

// Use attribute name to extract the corresponding value
switch (attributeName) {
case "gen_ai.request.max_tokens":
return getMaxTokens(jsonBody);
return llmJson.getMaxTokens();
case "gen_ai.request.temperature":
return getTemperature(jsonBody);
return llmJson.getTemperature();
case "gen_ai.request.top_p":
return getTopP(jsonBody);
return llmJson.getTopP();
case "gen_ai.response.finish_reasons":
return getFinishReasons(jsonBody);
return llmJson.getFinishReasons();
case "gen_ai.usage.input_tokens":
return getInputTokens(jsonBody);
return llmJson.getInputTokens();
case "gen_ai.usage.output_tokens":
return getOutputTokens(jsonBody);
return llmJson.getOutputTokens();
default:
return null;
}
} catch (JsonProcessingException e) {
} catch (RuntimeException e) {
return null;
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic should swallow all Exceptions raised by the custom parser so that we do not propagate it to the application level

@yiyuan-he yiyuan-he changed the base branch from v1_32_1_dev to v1_33_6_patch December 16, 2024 21:55
@yiyuan-he yiyuan-he changed the base branch from v1_33_6_patch to v1_32_1_dev December 16, 2024 21:56
Copy link
Owner

@mxiamxia mxiamxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yiyuan-he
Copy link
Author

Closing this PR since #27 was merged to our new v1_33_6_patch branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants