Skip to content

Commit

Permalink
Removing Prepare() call
Browse files Browse the repository at this point in the history
  • Loading branch information
sebastienros authored Feb 23, 2018
1 parent 390c8c5 commit 47946e4
Showing 1 changed file with 0 additions and 4 deletions.
4 changes: 0 additions & 4 deletions src/Benchmarks/Data/RawDb.cs
Original file line number Diff line number Diff line change
Expand Up @@ -63,10 +63,6 @@ DbCommand CreateReadCommand(DbConnection connection)
id.Value = _random.Next(1, 10001);
cmd.Parameters.Add(id);

// Prepared statements improve PostgreSQL performance by 10-15%
// Especially if you only call them once, instead of on every execution :)
cmd.Prepare();

return cmd;
}

Expand Down

12 comments on commit 47946e4

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this isn't a permanent change, as it will significantly degrade performance... Is this somehow the source of the issue you've been seeing?

@sebastienros
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will significantly degrade performance

Looking at the graphs it significantly improved them already, on all scenarios. For Npgsql we are now using automatic preparation, which increase the perf for Dapper and EF based scenarios who don't expose the method. So we are not dropping the preparation altogether. Maybe you have fixed this in your perf branch?

But also it improved the reliability issues that made me look into this in the first place. Maybe there is an issue in the driver that would do unnecessary things when it was called, but look at the graphs for Raw and you will see that all the spikes down went away once we removed the calls.

In the case of other drivers, MySql doesn't implement this method at all. I think SqlClient is a no-op as the server does automatic query plans on the server. But I have heard at some point that the client was still doing something, I will investigate. But right now we are not measuring it on TE. We still have to possibility to add another implementation specific for each driver too.

Maybe there is a discussion to have in terms of APIs, as the command execution repetition should be handled by the driver (once per physical connection in the case of npgsl) but the user needs to call it on every command. Also because it can't be exposed from an ORM. Then in the end a solution that works on both issues it to configure the behavior on the connection string and the frequency of each statement like you did. I saw the same thing in another framework provider on TE (don't recall which one). That might also be part of a best practice for driver implementations if we agree on that. I'll raise the concern during our next meeting. aspnet/DataAccessPerformance#31

@sebastienros
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we made the change, better performance and better reliability

  • red is middleware Raw
  • blue is mvc Raw

image

@bgrainger
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of other drivers, MySql doesn't implement this method at all.

The relevant issue is mysql-net/MySqlConnector#397.

(Since it's currently unimplemented, I don't have any hard numbers to back this up, but I don't see how Prepare would improve performance for MySQL unless the same MySqlCommand objects are being cached and reused, which isn't the case here: LoadSingleQueryRow disposes the MySqlCommand created by CreateReadCommand.)

@divega
Copy link

@divega divega commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roji I am curious if this could simply mean that there is a bug in Prepare. Also does it do actual work on the server when you call it or does it set a flag so that the first execution prepares?

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will significantly degrade performance

Looking at the graphs it significantly improved them already, on all scenarios. For Npgsql we are now using automatic preparation, which increase the perf for Dapper and EF based scenarios who don't expose the method. So we are not dropping the preparation altogether. Maybe you have fixed this in your perf branch?

Are you saying that after removing Prepare(), you saw a performance increase in the raw scenarios with Npgsql? I'm assuming that you turned on autoprepare (via the connection string), right? That's quite surprising - my assumption is that explicit preparation (i.e. calling Prepare()) should be somewhat faster than automatic preparation, since you're bypassing all the book-keeping associated with autopreparation (counting how many times statements were used, etc...). So at least in theory, applications which can prepare explicitly should do that, and autoprepare should only be used where that's not possible (dapper, EF). I wonder what's going on there, I'll try to investigate at some point. In addition, if explicit preparation causes reliability issues that's definitely something I need to fix.

In the case of other drivers, MySql doesn't implement this method at all. I think SqlClient is a no-op as the server does automatic query plans on the server. But I have heard at some point that the client was still doing something, I will investigate. But right now we are not measuring it on TE. We still have to possibility to add another implementation specific for each driver too.

Regarding the general usefulness of Prepare():

  • It's true that SqlClient's Prepare() is currently (mostly) a no-op, but we discussed this with @divega and @saurabh500 and there could be some performance gains for doing so. Explicit preparation allows you to avoid sending the SQL to the server every single time, and also to avoid sending the description of the shape of the resultset. So in the future preparing on SqlClient could definitely be important.
  • After some discussion with @bricelam, it seems that preparing on Sqlite would also be beneficial, even if I think that the driver currently doesn't do anything.
  • Don't forget that autoprepare is currently an Npgsql-only feature (I've just opened #27448 to track implementing this in a provider-independent way).

So at least in the mid- and long-term, I think we should keep treating explicit Prepare() as an important part of the performance story...

@bgrainger see https://github.com/dotnet/corefx/issues/27447 which I've just opened. In a nutshell, Npgsql keeps the prepared statements internally for when the same SQL is executed (on a different DbCommand or even on a different pooled DbConnection using the same physical connection). This allows preparation to work for even as DbCommand and DbConnection instances are cycled.

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divega our messages crossed. Yeah, this is pointing towards some sort of bug - which is quite surprising. I also can't explain how I didn't run into this in all my experimentation... I'll try to look into it (but it probably won't happen very soon).

To answer your question, Prepare() does its job when it's called, rather than setting a flag (after all, it's supposed to be the one heavy operation you do, after which the executions are more optimized). However, executing a prepared statement is of course a different flow (and involves different protocol messages too), so it could be a bug in the prepared flow specifically...

@sebastienros
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto Prepare made the raw benchmarks slightly faster in comparison to direct calls to Prepare. See the red chart. And also more stable, which tends to prove there is an issue under load.

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastienros that's very weird, but thanks for all the work on it. I'll try to investigate this when I can...

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastienros maybe it's worth having both scenarios side-by-side, if it's not too much work to add?

@sebastienros
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will just need to configure a different database provider as we need a separate connection string too. Maybe we can already test it in Data Access Benchmark instead.

@roji
Copy link
Member

@roji roji commented on 47946e4 Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I plan to go back to perf work in a few days (am currently deep in EF Core) so I'll update you on how it goes.

Please sign in to comment.