dk-query comments #4529
Conversation
Thank you @dkuku. Looking at the impact this has on the codebase, I am thinking the PG route may be the better one after all. My thinking was that we would only add static comments, but if we want both, for example to later add stacktrace information, I say the work on PostgreSQL is a better fit (and can play nicely with stuff like the …)
My concern with postgrex arose when I started disabling the prepared statements. I started reading about them and there are clear benefits. If someone only wants to add static comments to track where a query originates, then there is no need to disable them. In that case the PR there can be even simpler and just include documentation saying that you should not add any data that changes.
I understand that concern, but can you provide examples of where the comment would be static and based on the place the query was written, rather than where it was executed, i.e. the repository?
You're right – most of the options can be added when we call the Repo and pass them as parameters. I created this PR mainly to retain the ability to use prepared queries with comments, especially when the comment includes, for example, only the app name executing the DB call. The traces and spans are a nice addition, especially since sqlcommenter has become a widely supported standard. My main use case is to identify where an expensive query originates. This type of comment can be cached in ETS, just like an existing query, which is why I thought Ecto might be a better fit. However, I'm happy to revert to the Postgrex PR.
I think it is good to have explored both approaches. I think starting with PG is the best route at the moment :) Thank you ❤️
Out of curiosity, what is lacking currently from telemetry events for you to determine where a query originated from? |
You need to forward telemetry events to, for example, Datadog. The default Postgres integration does not understand telemetry but does support sqlcommenter. Is there a way to integrate it with telemetry here? Currently, a single database can be used by multiple frameworks. In the simplest scenario, you might migrate from Rails to Elixir, resulting in queries from two different frameworks appearing in the Postgres logs. Multiple Elixir microservices can also execute queries on the same database (e.g., when you extract part of a service into a separate one). Additionally, a single context module can be executed from multiple places (REST, GraphQL, gRPC, event consumers, Oban jobs, and periodic tasks). In large applications, you might have all of the above scenarios. In case something goes wrong, the more data you have for debugging, the better.
I agree. I'm currently working on a tool for that, to help with database observability; it leverages telemetry events and stores them as time series in an automatically partitioned table based on which Ecto adapter is being used. I even reconcile dependencies, so clicking on a stacktrace takes you to the repository. Anyway, feel free to reach out if this is something that interests you; you can find my email in my GitHub profile.
To give some more context: having this functionality is required to enable DB monitoring on Datadog to link queries back to Ecto Otel spans (this does not work today). In fact, this is exactly how they do it in e.g. their nodejs library. |
How is telemetry not sufficient in this case? You can send Ecto telemetry events to Datadog or even export them to OTel.
It does not work. You need the …
@Schultzer telemetry events don't correlate with telemetry from the database itself. It won't tell you that your query is running a sequential scan. Passing the context (OpenTelemetry specifically) enables this capability and is why sqlcommenter exists and why I think it is important for ecto and postgrex to support this. |
@aloukissas For spans you can follow this guide: |
@dkuku oh snap :), interested in getting that up into https://github.com/open-telemetry/opentelemetry-erlang-contrib |
@tsloughter we can do that, I can also transfer the package ownership. |
@dkuku Cool. Are you on Slack? Let's leave the Ecto repo's PR comments :)
This is a continuation of the work that I started in the postgrex repo; I think ecto will benefit from it more. For completeness, I think an alternative to this PR may be to provide a callback that receives the generated query and allows modifying it with a comment, but then we don't know whether it could be cached and used as a prepared query.
It adds a new field in Ecto.Query that aggregates comments.
The comments are then added to the generated query:
/*comment1*/ /*comment2*/ SELECT s0."x" FROM (/*subquery comment*/ SELECT ss0."x" AS "x" FROM "schema" AS ss0) AS s0
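The prepending shown above can be sketched as a small string-building step (a minimal illustration in plain Elixir, not the PR's implementation; the variable names are mine):

```elixir
# Minimal sketch (not the PR's code): wrap each aggregated comment in /* */
# and prepend them, space-separated, to the generated SQL string.
comments = ["comment1", "comment2"]
sql = ~s(SELECT s0."x" FROM "schema" AS s0)

commented =
  Enum.map_join(comments, " ", &("/*" <> &1 <> "*/")) <> " " <> sql

IO.puts(commented)
```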
Comments can also be inspected on the query struct.
Currently a comment can be:
static, for example: module, function, app name, repo name, job name, host.
dynamic, for example: traces and spans, IP.
This may be slower when compiling a query and also requires more resources to parse the query in the database. But not everyone is running a database at 80% load. Also, traces can be sampled, so only a small percentage of queries would include the additional data. I need it mostly for sqlcommenter; libraries in other languages do it similarly, and Rails has it built in.
What next:
If */ is included in a comment, I think it's better to replace any occurrence of */ with * / so that any SQL injection attempt is logged and the data saved. By default the comments will be in front, but the sqlcommenter specification requires comments at the end.
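The */ replacement described above is essentially a one-liner; a hedged sketch (the module and function names are mine, not part of the PR):

```elixir
defmodule CommentSanitizer do
  @doc """
  Replaces `*/` with `* /` so user-supplied data cannot terminate the
  surrounding SQL comment; the injection attempt stays logged as plain text.
  """
  def sanitize(comment) when is_binary(comment) do
    String.replace(comment, "*/", "* /")
  end
end
```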
comment() could be configured similarly to the Logger format, with access to a compile-time env struct. This would allow adding metadata to a query composed from multiple files. The format
$line: $module.$function
will create a compile-time comment that can be cached and prepared, and the resulting SQL would look like:
/*22: MyApp.QueryBuilder.build*/ /*33: MyApp.QueryBuilder.filter*/ SELECT s0."x" FROM "schema" where z = ? AS s0
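A compile-time comment in that $line: $module.$function shape could be produced with a macro reading __CALLER__. This is a sketch under my own naming, not the PR's API:

```elixir
defmodule QueryComment do
  # Expands at compile time to a literal "/*line: Module.function*/" string,
  # so the comment is static and the query stays cacheable as a prepared query.
  defmacro comment do
    env = __CALLER__
    {fun, _arity} = env.function
    text = "#{env.line}: #{inspect(env.module)}.#{fun}"
    quote do: unquote("/*" <> text <> "*/")
  end
end

defmodule Demo do
  require QueryComment
  # The macro captures this call site's line, module, and function name.
  def build, do: QueryComment.comment()
end
```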
comment(:sqlcommenter, keyword) will either create a sqlcommenter comment itself or require the use of an external library to do the escaping and just accept the comment as a string or iolist. The escaping logic is pretty simple and I think it could be included in ecto. This will require specifying in config: ecto: comment_position: :append
This may require flushing previous comments or merging them into one; the sqlcommenter specification says there should be only one. But we can, for example, aggregate key-value pairs and then build them all into a single comment at the end; an additional config option may be required.
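The escaping the sqlcommenter format calls for (sort keys, URL-encode keys and values, single-quote the values, join pairs with commas) does fit in a few lines. A sketch with assumed names, not an Ecto or Postgrex API:

```elixir
defmodule Sqlcommenter do
  @doc """
  Serializes key-value pairs into a single sqlcommenter-style comment:
  keys sorted lexicographically, keys and values percent-encoded,
  values wrapped in single quotes, pairs joined with commas.
  """
  def to_comment(pairs) do
    body =
      pairs
      |> Enum.map(fn {k, v} -> {to_string(k), to_string(v)} end)
      |> Enum.sort()
      |> Enum.map_join(",", fn {k, v} ->
        encode(k) <> "='" <> encode(v) <> "'"
      end)

    "/*" <> body <> "*/"
  end

  # Percent-encode everything outside the unreserved set (A-Z a-z 0-9 - . _ ~),
  # which also neutralizes quotes and comment delimiters inside values.
  defp encode(s), do: URI.encode(s, &URI.char_unreserved?/1)
end
```

For example, `Sqlcommenter.to_comment(route: "/polls", app: "checkout")` produces a single comment with the keys in sorted order and the slash in the route percent-encoded.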