You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We occasionally have "Connection reset" errors in Convergence server logs and 500 Internal Server Error in response to a REST API call.
GET domains/convergence/default/models/ffreport-3605: 500 Server Error
b'{"body":{"error_code":"internal_server_error"},"ok":false}'
Convergence server logs:
INFO: Caught Network I/O errors on odb.pl-grt-orientdb:2424/7048553260737367023, trying an automatic reconnection... (error: Connection reset)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:186) ~[?:?]
at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
at java.io.BufferedInputStream.read(BufferedInputStream.java:271) ~[?:?]
at java.io.DataInputStream.readByte(DataInputStream.java:270) ~[?:?]
at com.orientechnologies.orient.enterprise.channel.binary.OChannelBinary.readByte(OChannelBinary.java:75) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:176) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.binary.OChannelBinaryAsynchClient.beginResponse(OChannelBinaryAsynchClient.java:160) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.beginResponse(OStorageRemote.java:1891) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.lambda$networkOperationRetryTimeout$2(OStorageRemote.java:248) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:307) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
... 42 more
16:34:35 ERROR RealtimeModelActor - Unexpected error getting model
com.orientechnologies.common.io.OIOException: Connection reset
at com.orientechnologies.orient.client.remote.OStorageRemote.baseNetworkOperation(OStorageRemote.java:351) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.networkOperationRetryTimeout(OStorageRemote.java:228) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.networkOperationNoRetry(OStorageRemote.java:261) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.client.remote.OStorageRemote.command(OStorageRemote.java:947) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentRemote.indexQuery(ODatabaseDocumentRemote.java:397) ~[com.orientechnologies.orientdb-client-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.index.OIndexRemoteOneValue.get(OIndexRemoteOneValue.java:48) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.index.OIndexRemoteOneValue.get(OIndexRemoteOneValue.java:39) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.index.OIndexAbstractDelegate.get(OIndexAbstractDelegate.java:59) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:231) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.orientechnologies.orient.core.index.OIndexTxAwareOneValue.get(OIndexTxAwareOneValue.java:38) ~[com.orientechnologies.orientdb-core-3.0.32.jar:3.0.32]
at com.convergencelabs.convergence.server.backend.datastore.OrientDBUtil$.$anonfun$findDocumentFromSingleValueIndex$1(OrientDBUtil.scala:189) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at scala.util.Try$.apply(Try.scala:210) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at com.convergencelabs.convergence.server.backend.datastore.OrientDBUtil$.findDocumentFromSingleValueIndex(OrientDBUtil.scala:185) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore$.com$convergencelabs$convergence$server$backend$datastore$domain$model$ModelStore$$findModelDocument(ModelStore.scala:333) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore.$anonfun$getModel$1(ModelStore.scala:198) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.$anonfun$withDatabase$3(PooledDatabaseProvider.scala:58) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.util.TryWithResource.tryWithResolvedResource(TryWithResource.scala:50) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.util.TryWithResource.com$convergencelabs$convergence$server$util$TryWithResource$$tryWithResource(TryWithResource.scala:36) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.util.TryWithResource$.apply(TryWithResource.scala:78) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.$anonfun$withDatabase$1(PooledDatabaseProvider.scala:56) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at scala.util.Success.flatMap(Try.scala:258) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at com.convergencelabs.convergence.server.backend.db.PooledDatabaseProvider.withDatabase(PooledDatabaseProvider.scala:55) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.datastore.AbstractDatabasePersistence.withDb(AbstractDatabasePersistence.scala:32) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.datastore.domain.model.ModelStore.getModel(ModelStore.scala:196) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$3(RealtimeModelActor.scala:325) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$3$adapted(RealtimeModelActor.scala:323) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at scala.util.Success.flatMap(Try.scala:258) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at com.convergencelabs.convergence.server.backend.services.domain.model.RealtimeModelActor.$anonfun$retrieveModel$1(RealtimeModelActor.scala:323) ~[com.convergencelabs.convergence-server-1.0.0-rc.7.jar:1.0.0-rc.7]
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:671) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:430) ~[org.scala-lang.scala-library-2.13.2.jar:?]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:56) ~[com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:93) ~[com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) [org.scala-lang.scala-library-2.13.2.jar:?]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94) [org.scala-lang.scala-library-2.13.2.jar:?]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:93) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:48) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:48) [com.typesafe.akka.akka-actor_2.13-2.6.6.jar:2.6.6]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) [?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) [?:?]
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) [?:?]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) [?:?]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) [?:?]
More details:
We observe no other connectivity issues with Convergence server or OrientDB.
There are no errors or warnings in OrientDB logs.
We have everything deployed to a Kubernetes cluster. Our installation has 1 instance of convergence-seed, 3 instances of convergence-server, and 1 instance of OrientDB.
Maybe you could give us some hints for what might go wrong and tips on what Convergence server settings to tweak. Since GET should be a safe method, should we implement retries on the side of our application that makes REST API calls?
REST API client sometimes gets 500 Internal Server Error from Convergence.
Expected Behavior
If Convergence server has some kind of an OrientDB connection pool which might contain stale connections, it should reconnect when hitting connection reset.
Alternatively, if retries are considered a caller's duty, Convergence server should probably return 503 instead of 500 to signify that the client should make a retry.
The text was updated successfully, but these errors were encountered:
Thanks for reporting. This does seem like a bug. I do believe we are using the built in OrientDB connection pooling capability. In general, we would expect that the connection pool would be keeping the connections alive or fresh.
We are using the OrientDB connection pooling, so this seems like there is an error in the connection pool logic on the OrientDB side. We are in the process of upgrading from 3.0.x to 3.2.x of OrientDB so I am hoping that this will resolve the situation.
As a matter of fact, after we reduced the number of convergence-server instances we run in the Akka cluster from 3 to 1 we basically stopped receiving these "Connection reset" errors. They only happen occasionally (a couple of times in a month) and no longer bother us.
But as you were going to upgrade OrientDB client libraries to 3.2.x (and it seems it hasn't been done yet in 1.0.0-rc.12), I think we should keep this issue open.
Hi!
We occasionally have "Connection reset" errors in Convergence server logs and 500 Internal Server Error in response to a REST API call.
Convergence server logs:
More details:
Maybe you could give us some hints for what might go wrong and tips on what Convergence server settings to tweak. Since GET should be a safe method, should we implement retries on the side of our application that makes REST API calls?
Versions
convergencelabs/convergence-server:1.0.0-rc.7
Describe the Bug
REST API client sometimes gets 500 Internal Server Error from Convergence.
Expected Behavior
If Convergence server has some kind of an OrientDB connection pool which might contain stale connections, it should reconnect when hitting connection reset.
Alternatively, if retries are considered a caller's duty, Convergence server should probably return 503 instead of 500 to signify that the client should make a retry.
The text was updated successfully, but these errors were encountered: