Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections not properly closed on errors #250

Open
chr1sjf0x opened this issue Mar 19, 2019 · 8 comments
Open

Connections not properly closed on errors #250

chr1sjf0x opened this issue Mar 19, 2019 · 8 comments

Comments

@chr1sjf0x
Copy link

chr1sjf0x commented Mar 19, 2019

MariaEX: 0.9.1
Elixir: 1.8.1
Erlang: 21.2
Run on Alpine:3.6 (OpenSSL 1.0.2k-fips 26 Jan 2017)

In our deployment, we had an incorrect version of OpenSSL, which resulted in this error:

** (UndefinedFunctionError) function :crypto.hash/2 is undefined (module :crypto is not available)
    (crypto) :crypto.hash(:sha, "REDACTED")
    (mariaex) lib/mariaex/protocol.ex:1097: Mariaex.Protocol.mysql_native_password/2
    (mariaex) lib/mariaex/protocol.ex:212: Mariaex.Protocol.handle_handshake/3
    (mariaex) lib/mariaex/protocol.ex:171: Mariaex.Protocol.handshake_recv/2
    (db_connection) lib/db_connection/connection.ex:66: DBConnection.Connection.connect/2
    (connection) lib/connection.ex:622: Connection.enter_connect/5
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

With the auto-recovery feature of our environment, repeated attempts to connect to the database resulted in our AWS RDS running out of connections:

** (RuntimeError) connect raised CaseClauseError exception.The exception details are hidden, as they may contain sensitive data such as database credentials. You may set :show_sensitive_data_on_connection_error to true if you wish to see all of the details
    (mariaex) lib/mariaex/protocol.ex:1245: Mariaex.Protocol.abort_statement/3
    (mariaex) lib/mariaex/protocol.ex:232: Mariaex.Protocol.handle_handshake/3
    (mariaex) lib/mariaex/protocol.ex:171: Mariaex.Protocol.handshake_recv/2
    (db_connection) lib/db_connection/connection.ex:66: DBConnection.Connection.connect/2
    (connection) lib/connection.ex:622: Connection.enter_connect/5
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

Seems failures in the first code path to not result in proper termination of connections.

@chr1sjf0x
Copy link
Author

Maybe a similar root cause as issue 230.

@derekkraan
Copy link

We had DB troubles this morning, resulting in the following error, which bubbled up and crashed our app eventually (even though we have designed the app to be able to function in the absence of the database).

GenServer #PID<0.24339.180> terminating
** (RuntimeError) connect raised CaseClauseError exception.The exception details are hidden, as they may contain sensitive data such as database credentials. You may set :show_sensitive_data_on_connection_error to true if you wish to see all of the details
    (mariaex) lib/mariaex/protocol.ex:1245: Mariaex.Protocol.abort_statement/3
    (mariaex) lib/mariaex/protocol.ex:232: Mariaex.Protocol.handle_handshake/3
    (mariaex) lib/mariaex/protocol.ex:171: Mariaex.Protocol.handshake_recv/2
    (db_connection) lib/db_connection/connection.ex:66: DBConnection.Connection.connect/2

By the time I got show_sensitive_data_on_connection_error turned on, the DB issues were fixed, so unfortunately I don't have more information than this. But I think there is some kind of connection condition causing mariaex to crash, which then in turn causes db_connection to crash (which is not supposed to happen).

@derekkraan
Copy link

I have looked at the code to try to understand what is going on here. handshake_recv/2 is called like so on line 103: handshake_recv(s, %{opts: opts})
Then on line 171 this %{opts: opts} gets sent to handle_handshake/3 as second parameter.
There are three function heads for handle_handshake/3. As far as I can see it's supposed to come out to the one first one, defined on line 202, but it ends up in the third one, defined on line 231.

This is where I lose the plot a little bit, but the case statement on line 1245 is not set up to handle %{opts: opts}, so it crashes.

I can't tell if it's supposed to get to handle_handshake/3 on line 231. Possibly the answer is yes (?). If it is supposed to be able to get there, then there needs to be a case clause added to abort_statement to handle %{opts: opts} as query.

@wojtekmach
Copy link
Contributor

@derekkraan we had a very similar issue on myxql and I believe you're on the right track. I'm able to reproduce it like this:

mysql> set global max_connections = 1; # don't do it on production box!
# run this a few times:
iex>  Mariaex.start_link(database: "myxql_test", show_sensitive_data_on_connection_error: true)

eventually will crash with:

10:26:12.687 [error] GenServer #PID<0.211.0> terminating
** (CaseClauseError) no case clause matching: %{opts: [socket_options: [], sock_type: :tcp, cache_size: 100, timeout: 5000, port: 3306, hostname: "localhost", password: nil, username: "wojtek", database: "myxql_test", show_sensitive_data_on_connection_error: true]}
    (mariaex) lib/mariaex/protocol.ex:1288: Mariaex.Protocol.abort_statement/3
    (mariaex) lib/mariaex/protocol.ex:262: Mariaex.Protocol.handle_handshake/3
    (mariaex) lib/mariaex/protocol.ex:179: Mariaex.Protocol.handshake_recv/2
    (db_connection) lib/db_connection/connection.ex:66: DBConnection.Connection.connect/2
    (connection) lib/connection.ex:622: Connection.enter_connect/5
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: nil
State: Mariaex.Protocol
** (EXIT from #PID<0.193.0>) shell process exited with reason: killed

@derekkraan
Copy link

@wojtekmach thanks for commenting. I will take a look to see if I can write a test case for this scenario for my PR.

@wojtekmach
Copy link
Contributor

Yeah, this is tricky since triggering the condition using max_connections can negatively affect other tests and ideally you'd restore this global state after the test. I couldn't think of another way of testing this scenario so because of this we unfortunately ended up leaving this untested.

@derekkraan
Copy link

The version of mariadb I'm using to test doesn't allow setting max_connections to less than 10 😂

@derekkraan
Copy link

I added a test to the PR based on your idea @wojtekmach. Fails on master, passes in my branch. Thanks for the help, it'll be nice to get this fixed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants