MBS-13453: Avoid double-decoding some errors #3323
Thank you! The changes make perfect sense.
Would it be feasible to make the test described in the commit message a CI test?
Do you remember how you came to that conclusion? Because I'm not sure that that's the issue... I tried playing with The "Wide character" error that happens when you run the tests is being thrown by the I don't particularly like adding
eval { $msg = decode_utf8($msg); };
$exception->throw( sqlstate => $state, message => $msg );
I honestly have zero memory of this by now. I think I just looked at what stopped breaking when changing things in different places, and this was the one I found, but it might be deeper than that. I do remember being very confused about why the stuff was different in some cases than others, at least... Happy to try with your option, anyway, if you tested that it does the same.
a2fc983 to 4fb2bfe
I tested your suggestion now and it seems to work - I still don't understand what the origin of the issue is and would like to figure it out, but maybe this is good enough for now. We should have a comment explaining the issue though and why we have the
Well, I'm also not sure why we have an I'm actually wondering if we should remove the decode entirely and intentionally regress on MBS-11207.
I can live with that as well if you prefer - ideally we'd still figure out where this is coming from, but...
This isn't exactly right. After inspecting the
diff --git a/lib/MusicBrainz/Server/Connector.pm b/lib/MusicBrainz/Server/Connector.pm
index b0f5b66222..871037b727 100644
--- a/lib/MusicBrainz/Server/Connector.pm
+++ b/lib/MusicBrainz/Server/Connector.pm
@@ -3,6 +3,7 @@ use Moose;
use MusicBrainz::Server::Exceptions;
use DBIx::Connector;
use Sql;
+use Encode qw( decode_utf8 );
has 'conn' => (
isa => 'DBIx::Connector',
@@ -55,6 +56,10 @@ sub _build_conn
my $exception = 'MusicBrainz::Server::Exceptions::DatabaseError';
$exception .= '::StatementTimedOut'
if $state eq '57014';
+ # Sometimes we receive a byte string that doesn't have the UTF8
+ # flag set; other times it's already been decoded (MBS-11207).
+ $msg = decode_utf8($msg)
+ unless utf8::is_utf8($msg);
$exception->throw( sqlstate => $state, message => $msg );
},
RaiseError => 0,
While the idea in 20640de was correct, the change was too wide in application; it was also decoding errors that did not need it, causing a wide character ISE because of the double decoding.

The double-decoding could be triggered (before this patch) by doing a utf8 change on a sql data file for a test and then running it - I changed work.sql to have an invalid work ID (to trigger an error) and ♥ instead of Test as a work name. Then I just triggered the error with prove -lv t/tests.t :: --tests Edit::Work::Create

mwiencek looked into the root causes and said: "We do always get a UTF-8 string. But sometimes it's just a raw byte string (with no UTF8 flag set), and other times it's already been decoded (UTF8 flag on)."

So this checks whether the flag is on, and only decodes the string if it is not.

Co-Authored-By: Michael Wiencek <[email protected]>
Ok, that does seem more elegant. Ideally we'd still find out why only some come pre-decoded, but it's a lot less messy than before. Changed (and changed the commit message accordingly).
Thanks, I replaced parts of the PR description with the updated commit message to avoid confusion if we have to revisit this.
Fix MBS-13453
Problem
When there are PSQL problems, we sometimes get "Wide character" ISEs that hide the real errors.
Solution
While the idea in 20640de was correct, the change was too wide in application; it was also decoding errors that did not need it, causing a wide character ISE because of the double decoding.
mwiencek looked into the root causes and said: "We do always get a UTF-8 string. But sometimes it's just a raw byte string (with no UTF8 flag set), and other times it's already been decoded (UTF8 flag on)."
So this checks whether the flag is on, and only decodes the string if it is not.
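The flag check described above can be tried in isolation with a small standalone script. This is only a sketch: `decode_once` is a hypothetical helper name used for illustration (the actual change inlines the check in `_build_conn`), and whether the final double decode dies, and with which message, is assumed to depend on the installed Encode version.

```perl
use strict;
use warnings;
use Encode qw( decode_utf8 );

# Hypothetical helper, named here for illustration only: decode a
# message at most once, using the UTF8 flag as the guard.
sub decode_once {
    my ($msg) = @_;
    $msg = decode_utf8($msg)
        unless utf8::is_utf8($msg);
    return $msg;
}

# U+2665 (the heart used in the manual test) as raw UTF-8 bytes: flag off.
my $bytes = "\xE2\x99\xA5";
# The same character after one decode: flag on.
my $chars = decode_utf8($bytes);

# Both inputs should come out as a single character.
for my $input ($bytes, $chars) {
    my $out = decode_once($input);
    printf "flag in: %d, length in: %d, length out: %d\n",
        (utf8::is_utf8($input) ? 1 : 0), length($input), length($out);
}

# Decoding the already-decoded string a second time is the failure mode;
# whether it dies, and with which message, depends on the Encode version.
my $twice = eval { decode_utf8($chars) };
print defined $twice
    ? "second decode returned a value\n"
    : "second decode died: $@";
```

With the guard in place, a raw byte string and an already-decoded string both end up as the same character string, which is what lets the real database error reach the exception instead of being masked.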
Testing
Manually.
This specific issue can be triggered for testing by changing the host in DBDefs to something dumb and utf8ish. I used '♥localhost'.
The double-decoding could be triggered (before this patch) by doing a similar change on a .sql data file for a test and then running it - I changed work.sql to have an invalid artist ID (to trigger an error) and '♥' instead of 'ABBA' as an AC name. Then I just triggered the error with prove -lv t/tests.t :: --tests Edit::Work::Create