Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surveda crashes when Guisso returns invalid oauth token #2218

Open
ysbaddaden opened this issue Mar 16, 2023 · 8 comments
Open

Surveda crashes when Guisso returns invalid oauth token #2218

ysbaddaden opened this issue Mar 16, 2023 · 8 comments
Labels

Comments

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Mar 16, 2023

When Guisso returns an invalid token (e.g. expires_at is nil instead of an UNIX timestamp), the Broker crashes, taking down the whole Surveda application altogether.

Surveda should be resilient to such issues. In the worst case, only the oauth gen_server should crash yet be restarted by its supervisor (and not crash the whole app). At best, failing to refresh an oauth token should be taken care of: consider the channel down and try again later (probably)? delete the token?

Related to instedd/guisso#83

@ysbaddaden ysbaddaden added the bug label Mar 16, 2023
@ysbaddaden
Copy link
Contributor Author

ysbaddaden commented Mar 28, 2023

I'm reproducing on my local: I have a Verboice channel configured in Surveda on my local environment and I configured Guisso for everything.

If I try to start Surveda before I start Guisso (i.e. which happens all the time, because I always forget to start Guisso), then Surveda crashes because it can't get a valid token on behalf of Verboice. Once I start Guisso, then Surveda won't crash anymore on start.

Verboice on the other hand doesn't have any issue: it simply reports the channel as Unauthorized.

@nthiad
Copy link
Contributor

nthiad commented Apr 10, 2023

What type of channel do you have configured? I setup an smpp channel and have tried many permutations of logging in, logging out, and loading pages with guisso not running and guisso running and I haven't got anything to crash so far. I mostly get redirected to a guisso url that doesn't load when guisso is down but surveda hasn't crashed from that.

@diegoliberman
Copy link
Contributor

@ysbaddaden will provide the steps to reproduce the crash.

@ysbaddaden
Copy link
Contributor Author

I'll try to reproduce more exactly. It was either the QST simulator (SMS) or the Twilio simulator.

@ysbaddaden
Copy link
Contributor Author

What's breaking is the internal OAuth tokens for channels, not the OAuth web flow for connecting the UI to Surveda.

Surveda is requesting guisso for an oauth token for talking with verboice or nuntium, then will regularly refresh it after some minutes (e.g. when restarting surveda after some time).

@diegoliberman
Copy link
Contributor

@nthiad I don't know if the latest comments by Julien, help you to reproduce the issue.

@nthiad
Copy link
Contributor

nthiad commented Apr 21, 2023

I've tried more things and I've been working on other patches but I still haven't been able to reproduce this issue either intentionally or by working on other tickets.

@nthiad nthiad removed their assignment Apr 21, 2023
@ysbaddaden
Copy link
Contributor Author

For future reference: I forgot to start Guisso on my local, and Surveda won't start properly. Here's the stacktrace:

surveda-app-1  | Compiling 188 files (.ex)
surveda-app-1  | Generated ask app
surveda-app-1  | 2023-05-05T10:10:09.172Z [info] SurveyBroker started. Default batch size=10000. Limit per minute=100.
surveda-app-1  | 2023-05-05T10:10:09.173Z [info] FLOIP pusher: started.
surveda-app-1  | 2023-05-05T10:10:09.222Z [info] ChannelStatusServer: started.
surveda-app-1  | 2023-05-05T10:10:09.223Z [info] QuestionnaireSimulator started with simulation_ttl: 5
surveda-app-1  | 2023-05-05T10:10:09.299Z [info] Running AskWeb.Endpoint with cowboy 2.9.0 at 0.0.0.0:80 (http)
surveda-app-1  | 2023-05-05T10:10:09.300Z [info] Access AskWeb.Endpoint at http://app.surveda.lvh.me
surveda-app-1  | 2023-05-05T10:10:09.302Z [info] Running Ask.MetricsEndpoint with cowboy 2.9.0 at 0.0.0.0:9980 (http)
surveda-app-1  | 2023-05-05T10:10:09.303Z [info] Access Ask.MetricsEndpoint at http://localhost:9980
surveda-app-1  | 2023-05-05T10:10:09.303Z [info] Start cancelling survey (id: )
surveda-app-1  | 2023-05-05T10:10:10.184Z [info] FLOIP pusher: Scheduling next run in 15 minutes
surveda-app-1  | 2023-05-05T10:10:10.298Z request_id=F1w3hjXYQh6NPIwAAEVB [info] POST /oauth2/token
surveda-app-1  | 2023-05-05T10:10:10.312Z [error] GenServer Ask.OAuthTokenServer terminating
surveda-app-1  | ** (FunctionClauseError) no function clause matching in DateTime.from_unix/3
surveda-app-1  |     (elixir 1.10.4) lib/calendar/datetime.ex:122: DateTime.from_unix(nil, :second, Calendar.ISO)
surveda-app-1  |     (elixir 1.10.4) lib/calendar/datetime.ex:174: DateTime.from_unix!/3
surveda-app-1  |     (ask 0.31.0) lib/ask/o_auth_token.ex:24: Ask.OAuthToken.from_access_token/2
surveda-app-1  |     (ask 0.31.0) lib/ask/o_auth_token_server.ex:44: Ask.OAuthTokenServer.refresh/3
surveda-app-1  |     (ask 0.31.0) lib/ask/o_auth_token_server.ex:26: Ask.OAuthTokenServer.handle_call/3
surveda-app-1  |     (stdlib 3.12.1.2) gen_server.erl:661: :gen_server.try_handle_call/4
surveda-app-1  |     (stdlib 3.12.1.2) gen_server.erl:690: :gen_server.handle_msg/6
surveda-app-1  |     (stdlib 3.12.1.2) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
surveda-app-1  | Last message (from #PID<0.1843.0>): {:get_token, "verboice", "http://web.verboice.lvh.me", 1}
surveda-app-1  | State: []
surveda-app-1  | Client #PID<0.1843.0> is alive
surveda-app-1  |
surveda-app-1  |     (stdlib 3.12.1.2) gen.erl:167: :gen.do_call/4
surveda-app-1  |     (elixir 1.10.4) lib/gen_server.ex:1020: GenServer.call/3
surveda-app-1  |     (ask 0.31.0) lib/ask/runtime/verboice_channel.ex:145: Ask.Runtime.VerboiceChannel.create_client/2
surveda-app-1  |     (ask 0.31.0) lib/ask/runtime/verboice_channel.ex:23: Ask.Runtime.VerboiceChannel.new/1
surveda-app-1  |     (ask 0.31.0) lib/ask/runtime/session.ex:97: Ask.Runtime.Session.timeout/1
surveda-app-1  |     (ask 0.31.0) lib/ask/runtime/survey_broker.ex:79: anonymous fn/1 in Ask.Runtime.SurveyBroker.retry_respondent/1
surveda-app-1  |     (ecto_sql 3.9.2) lib/ecto/adapters/sql.ex:1203: anonymous fn/3 in Ecto.Adapters.SQL.checkout_or_transaction/4
surveda-app-1  |     (db_connection 2.4.3) lib/db_connection.ex:1611: DBConnection.run_transaction/4
surveda-app-1  |     lib/mut.ex:262: Mutex.apply_with_lock/3
surveda-app-1  |     (elixir 1.10.4) lib/enum.ex:783: Enum."-each/2-lists^foreach/1-0-"/2
surveda-app-1  |     (elixir 1.10.4) lib/enum.ex:783: Enum.each/2
surveda-app-1  |     (elixir 1.10.4) lib/enum.ex:783: Enum."-each/2-lists^foreach/1-0-"/2
surveda-app-1  |     (elixir 1.10.4) lib/enum.ex:783: Enum.each/2
surveda-app-1  |     (ask 0.31.0) lib/ask/runtime/survey_broker.ex:51: Ask.Runtime.SurveyBroker.handle_info/3
surveda-app-1  |     (stdlib 3.12.1.2) gen_server.erl:637: :gen_server.try_dispatch/4
surveda-app-1  |     (stdlib 3.12.1.2) gen_server.erl:711: :gen_server.handle_msg/6
surveda-app-1  |     (stdlib 3.12.1.2) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

The Survey Broker needs to communicate with verboice (to cancel some calls if I'm reading this correctly). For that it needs to refresh the access token from Guisso, but Guisso isn't started so it fails.

In my log, I have a second attempt, and then nothing. The Erlang process doesn't exit, but I can't reach Surveda from the Web. It's left in a zombie-like state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants