
feat: Add row count and affected rows to Trilogy spans #1290

Open · wants to merge 1 commit into main
Conversation

skipkayhil

Identifying queries that return/update many rows is very useful for improving database performance. Queries that return many rows can cause applications to use excess memory (or in some cases like Vitess) could be blocked altogether. Queries that update many rows can cause replication lag and generally put more pressure on a database.

This commit adds both of these values (row_count and affected_rows) to spans created from Trilogy#query to help identify these problematic queries. Trilogy::Result#count will always return a value (0 for mutations), but #affected_rows only returns values for mutations (nil for SELECTs).
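The attribute logic described above can be sketched as runnable plain Ruby, with illustrative stand-ins for `Trilogy::Result` and the span (these names and classes are hypothetical, not the instrumentation's actual internals):

```ruby
# Plain-Ruby stand-ins so the sketch runs without the trilogy or
# opentelemetry-sdk gems; Result mimics Trilogy::Result's interface.
Result = Struct.new(:count, :affected_rows)

class FakeSpan
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  # Mirrors the bulk-set shape of OpenTelemetry span attribute APIs.
  def add_attributes(attrs)
    @attributes.merge!(attrs)
  end
end

# Always record returned rows; record affected rows only when the
# result carries one (mutations -- it is nil for SELECTs).
def record_row_attributes(span, result)
  attributes = { 'db.response.returned_rows' => result.count }
  attributes['db.response.affected_rows'] = result.affected_rows unless result.affected_rows.nil?
  span.add_attributes(attributes)
end

select_span = FakeSpan.new
record_row_attributes(select_span, Result.new(3, nil)) # SELECT: 3 rows, no affected_rows

update_span = FakeSpan.new
record_row_attributes(update_span, Result.new(0, 7))   # UPDATE: 0 returned, 7 affected
```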


linux-foundation-easycla bot commented Dec 5, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: skipkayhil / name: Hartley McGuire (6bfc5d3)

@@ -164,6 +164,12 @@
_(span.attributes[OpenTelemetry::SemanticConventions::Trace::DB_SYSTEM]).must_equal 'mysql'
_(span.attributes[OpenTelemetry::SemanticConventions::Trace::DB_STATEMENT]).must_equal 'DESELECT ?'
end

it 'includes row count' do
client.query('SELECT 1')
skipkayhil (Author):

I only see SELECTs in this test file, so I'm not sure what the best way to test affected_rows is.

Contributor:

You should be able to run an insert and subsequent update command to assert the behaviour here, no?

-    super
+    result = super
+
+    span.set_attribute('db.response.returned_rows', result.count)
skipkayhil (Author):

db.response.returned_rows was recently added in open-telemetry/semantic-conventions@63094c6 but appears to be experimental. Is this (and .affected_rows) a reasonable key to use?

Contributor:

affected_rows seems to be the accepted language for this concept, so I don't see any problem there.

Collaborator:

I see a few challenges here:

  1. I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.
  2. We have in the past rejected adding metrics as attributes for other PRs, but it seems like now metrics are being added as span attributes. Does that mean we should reverse our decisions going forward?

Contributor:

We have in the past rejected adding metrics as attributes for other PRs, but it seems like now metrics are being added as span attributes. Does that mean we should reverse our decisions going forward?

Feeling a bit out of the loop on this one. Has there been some push back against reporting numeric details on spans?

Collaborator:

Yes. We have had folks submit PRs to add things like internal rails attributes for db times and rendering times.

I have said no to these kinds of attributes in the past because these are either derived span values or more appropriate to emit as metrics. I can look up specific instances of PRs.

Now that semconv is setting a precedent I am second guessing that position.

Contributor @robertlaurin commented Dec 10, 2024:

I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.

I wonder how we move forward then. I had hoped there would be less resistance to this change, as it is additive rather than modifying an existing attribute.

Just a quick scan of the db. prefixed attributes in the sem conv main branch.

Experimental attributes:

  • db.system
  • db.collection.name (this probably maps to db.name)
  • db.operation.name (we have db.operation)
  • db.response.status_code
  • db.operation.batch.size
  • db.query.summary
  • db.query.text (we have db.statement)
  • db.response.returned_rows
  • Nothing in the current experimental versioning maps to the db.user attribute we have
  • Nothing maps to affected_rows

I see three options:

  • I could come up with something to make the attribute versioning configurable, that feels a bit toily but maybe useful going forward
  • We just add these new attributes because one is unspecced, and one is experimental (but we're not breaking anything that exists)
  • We (shopify) look at bringing this instrumentation internal, because at the end of the day the instrumentation needs to solve problems for users and I don't really want to allow the slow moving beast that is spec compliance to prevent that.

I have said no to these kinds of attributes in the past because these are either derived span values or more appropriate to emit as metrics. I can look up specific instances of PRs.

I see what you're saying. I think in this instance it's a bit different: we want to associate these counts with the actual queries that produced them, and the high-cardinality nature of this data doesn't mesh well with metrics.

Contributor:

Both of these attributes seem really helpful for users and I'm excited by this PR. Thanks for opening it, @skipkayhil! I'd like to find a way to add these attributes to the instrumentation, while still supporting the semantic conventions.

I am trying to avoid having mixed semantic conventions here between pre-1.0 and 1.x+. This would be adding new attributes which would be a mix and likely confusing to our users.

I wonder how we move forward then. I had hoped there would be less resistance to this change, as it is additive rather than modifying an existing attribute.

@arielvalentin @robertlaurin - this sounds like a great opportunity to bring in OTEL_SEMCONV_STABILITY_OPT_IN. There's a description for this environment variable at the top of the Database Semantic Conventions.

With this environment variable, we can have a clear toggle to add new conventions without needing to interfere with users who are happy with the existing conventions.

Database semantic conventions are so close to going stable. We eventually need to remove the old conventions and adopt the new ones. I think we should start that process now, and introduce db.response.returned_rows as part of this move.

@skipkayhil, would you be interested in helping the Trilogy instrumentation support OTEL_SEMCONV_STABILITY_OPT_IN? We don't have any database instrumentation prototypes yet. I'd be happy to help work on something with you if you're interested.

affected_rows seems to be the accepted language for this concept, so I don't see any problem there.

I think this sounds like a great name for this concept, but I'm a little hesitant to add an attribute that isn't part of the semantic conventions yet.

What do you think about pausing on that attribute and opening an issue to propose adding db.response.affected_rows to the semantic conventions? Here's the form to propose a new convention. Also happy to help shepherd that conversation if you'd like.
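The opt-in toggle discussed above could be sketched as a small helper. This is a hypothetical illustration, not the instrumentation's actual code; the values are assumed to follow the HTTP semconv migration pattern ('database' for new names only, 'database/dup' for both old and new):

```ruby
# Hypothetical helper showing how OTEL_SEMCONV_STABILITY_OPT_IN could
# gate which attribute names an instrumentation emits.
def statement_attribute_keys
  opt_in = ENV.fetch('OTEL_SEMCONV_STABILITY_OPT_IN', '').split(',').map(&:strip)

  if opt_in.include?('database/dup')
    ['db.statement', 'db.query.text'] # emit old and new, for migration
  elsif opt_in.include?('database')
    ['db.query.text']                 # emit only the new stable name
  else
    ['db.statement']                  # default: current (pre-1.0) name
  end
end

ENV['OTEL_SEMCONV_STABILITY_OPT_IN'] = 'database/dup'
statement_attribute_keys # => ['db.statement', 'db.query.text']
```

The same branching would apply to each renamed attribute (db.operation vs db.operation.name, and so on), which is why a shared helper or configuration object is likely preferable to scattering these checks per attribute.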

kaylareopelle changed the title from "Add row count and affected rows to Trilogy spans" to "feat: Add row count and affected rows to Trilogy spans" on Dec 5, 2024
Comment on lines 63 to 64
span.set_attribute('db.response.returned_rows', result.count)
span.set_attribute('db.response.affected_rows', result.affected_rows) unless result.affected_rows.nil?
Contributor @robertlaurin commented Dec 6, 2024:

If you're adding more than one attribute, you want to use the add_attributes interface for bulk setting, otherwise you hit a mutex N times instead of once.

Suggested change:
-    span.set_attribute('db.response.returned_rows', result.count)
-    span.set_attribute('db.response.affected_rows', result.affected_rows) unless result.affected_rows.nil?
+    attributes = { 'db.response.returned_rows' => result.count }
+    attributes['db.response.affected_rows'] = result.affected_rows unless result.affected_rows.nil?
+    span.add_attributes(attributes)

skipkayhil force-pushed the hm-trilogy-rows-attributes branch from be86418 to 6bfc5d3 on December 10, 2024