This repository has been archived by the owner on Nov 7, 2022. It is now read-only.

Flaky connection while directly exporting to collector from ocagent exporter #582

Open
asutoshpalai opened this issue Jun 18, 2019 · 3 comments

asutoshpalai commented Jun 18, 2019

When using the Agent exporter with OpenCensus to directly export to the Collector, instead of exporting to Agent and then to Collector, the connection keeps resetting.

Is this the intended behavior? If so, is there a way or config option to maintain a stable connection?

As per the blog and design doc, it looks like both the Agent and Collector are optional and we should be able to export directly to the collector.

The bug reproduction

I modified example/main.go to enable debug logs from gRPC as follows:

diff --git a/example/main.go b/example/main.go
index 5fa9f5f..e7932df 100644
--- a/example/main.go
+++ b/example/main.go
@@ -24,13 +24,17 @@ import (
 	"time"
 
 	"contrib.go.opencensus.io/exporter/ocagent"
+	"github.com/sirupsen/logrus"
 	"go.opencensus.io/stats"
 	"go.opencensus.io/stats/view"
 	"go.opencensus.io/tag"
 	"go.opencensus.io/trace"
+	"google.golang.org/grpc/grpclog"
 )
 
 func main() {
+	logrus.SetLevel(logrus.DebugLevel)
+	grpclog.SetLogger(logrus.New())
 	oce, err := ocagent.NewExporter(
 		ocagent.WithInsecure(),
 		ocagent.WithServiceName(fmt.Sprintf("example-go-%d", os.Getpid())))
@@ -119,5 +123,6 @@ func main() {
 		}
 		stats.Record(ctx, mLatencyMs.M(latencyMs))
 		fmt.Printf("Latency: %.3fms\n", latencyMs)
+		oce.Flush()
 	}
 }

My Agent config:

receivers:
  opencensus:
    address: ":55678"

exporters:
  opencensus:
    endpoint: "localhost:55680"

zpages:
  port: 8884

My Collector config:

log-level: DEBUG
receivers:
  opencensus:
    port: 55680

queued-exporters:
  jaeger-all-in-one:
    num-workers: 4
    queue-size: 100
    retry-on-failure: true
    sender-type: jaeger-thrift-http
    jaeger-thrift-http:
      collector-endpoint: http://localhost:14268/api/traces
      timeout: 5s

zpages:
  port: 8889

When all three are run, we get

INFO[0000] pickfirstBalancer: HandleSubConnStateChange: 0xc000020290, CONNECTING 
INFO[0000] pickfirstBalancer: HandleSubConnStateChange: 0xc000020290, READY 

only once in the logs of example/main.go.

But if we don't run the agent and export directly to the Collector (by changing the port in the Collector's config), we get the above logs multiple times.
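To put a number on "multiple times", one rough approach is to count the balancer state transitions in the client's gRPC log output. The sketch below is only an illustration: countStateChanges is a hypothetical helper (not part of gRPC or OpenCensus), and it assumes the log lines keep the "HandleSubConnStateChange: <addr>, <STATE>" shape shown above. The sample input simply repeats the two lines from this report to mimic a reconnect.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countStateChanges is a hypothetical helper: it scans gRPC client log text
// and counts how often each sub-connection state was reported by lines
// containing "HandleSubConnStateChange:".
func countStateChanges(log string) map[string]int {
	const marker = "HandleSubConnStateChange:"
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		i := strings.Index(line, marker)
		if i < 0 {
			continue // not a state-change line
		}
		// The remainder is assumed to look like " 0xc000020290, READY".
		rest := line[i+len(marker):]
		j := strings.LastIndex(rest, ",")
		if j < 0 {
			continue
		}
		if state := strings.TrimSpace(rest[j+1:]); state != "" {
			counts[state]++
		}
	}
	return counts
}

func main() {
	// Illustrative input: the two log lines from the report, repeated twice.
	pair := "INFO[0000] pickfirstBalancer: HandleSubConnStateChange: 0xc000020290, CONNECTING \n" +
		"INFO[0000] pickfirstBalancer: HandleSubConnStateChange: 0xc000020290, READY \n"
	counts := countStateChanges(pair + pair)
	fmt.Printf("CONNECTING: %d, READY: %d\n", counts["CONNECTING"], counts["READY"])
}
```

With a stable connection, READY should be reported once per run; a count that keeps growing is consistent with the resets described here.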

@pjanotti

Thanks for reporting @asutoshpalai. This is due to the collector not implementing the metrics endpoint: the example periodically tries to send metric data and that resets the connection. The agent on the other hand implements the metrics endpoint and the reset doesn't happen. What happens if you remove the metrics from the example and go straight to the collector? Is that an option for you? That said it is a bug anyway...

@asutoshpalai

Thanks @pjanotti, that's correct! When I didn't register the exporter with view, everything worked fine. That's good enough for me, but I will leave this issue open in case you plan to fix this in the future.

@pjanotti

Thanks for confirming @asutoshpalai - yes, this is a bug that needs to be fixed. Leaving the issue open.
