Wathola Tracing for upgrade tests #6219

Merged
merged 12 commits into from
Mar 29, 2022

Conversation

mgencur
Contributor

@mgencur mgencur commented Mar 2, 2022

Partially fixes #4481
Fixes points 1. and 2. from #6145 (comment)
The automated reporting of broken traces would be done in a separate PR.

Proposed Changes

  • registers the Zipkin exporter using the trace config from the config-tracing config map in the Knative Eventing namespace
  • adds trace instrumentation to these components: wathola-sender, wathola-forwarder, wathola-receiver
  • if config-tracing enables tracing and a backend storage (Zipkin or Jaeger) is available, trace information is sent to that storage
  • a complete trace can be displayed in the Zipkin UI, going from wathola-sender to wathola-receiver and including the components in between (wathola-forwarder and the Eventing core components); example screenshot:
    [image: kafka_source_trace_reduced]
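
The third bullet amounts to a small decision: export spans only when the config names a backend and an endpoint. Below is a minimal sketch of that decision using only the standard library. It assumes the tracing config reaches the component as a flat JSON object; the key names `backend` and `zipkin-endpoint`, the example URL, and the helper `exportTarget` are assumptions made for illustration and are not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exportTarget returns the endpoint spans would be sent to, and false when
// tracing is disabled or incomplete. Hypothetical helper: the key names are
// assumptions for this sketch.
func exportTarget(configJSON string) (string, bool, error) {
	if configJSON == "" {
		return "", false, nil // no config at all: behave as a no-op
	}
	cfg := map[string]string{}
	if err := json.Unmarshal([]byte(configJSON), &cfg); err != nil {
		return "", false, fmt.Errorf("invalid tracing config: %w", err)
	}
	if cfg["backend"] != "zipkin" || cfg["zipkin-endpoint"] == "" {
		return "", false, nil // tracing not enabled, or nowhere to send spans
	}
	return cfg["zipkin-endpoint"], true, nil
}

func main() {
	endpoint, ok, err := exportTarget(
		`{"backend":"zipkin","zipkin-endpoint":"http://zipkin.example:9411/api/v2/spans"}`)
	fmt.Println(endpoint, ok, err)
}
```

Returning a boolean rather than an error for the "disabled" case keeps a missing or empty config from failing the test components.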

Pre-review Checklist

  • At least 80% unit test coverage
  • E2E tests for any new behavior
  • Docs PR for any user-facing impact
  • Spec PR for any new API feature
  • Conformance test for any change to the spec

Release Note


Docs

@knative-prow-robot knative-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/test-and-release Test infrastructure, tests or release labels Mar 2, 2022
@codecov

codecov bot commented Mar 2, 2022

Codecov Report

Merging #6219 (9ff8b09) into main (7001b65) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #6219   +/-   ##
=======================================
  Coverage   82.18%   82.18%           
=======================================
  Files         231      231           
  Lines        7787     7787           
=======================================
  Hits         6400     6400           
  Misses        937      937           
  Partials      450      450           


@mgencur mgencur changed the title Wathola tracing main Wathola Tracing Mar 2, 2022
@mgencur
Contributor Author

mgencur commented Mar 2, 2022

/retest

2 similar comments
@mgencur
Contributor Author

mgencur commented Mar 3, 2022

/retest

@mgencur
Contributor Author

mgencur commented Mar 3, 2022

/retest

@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mgencur
To complete the pull request process, please assign lberk after the PR has been reviewed.
You can assign the PR to them by writing /assign @lberk in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mgencur mgencur changed the title Wathola Tracing Wathola Tracing for upgrade tests Mar 3, 2022
@mgencur
Contributor Author

mgencur commented Mar 3, 2022

/retest

@mgencur
Contributor Author

mgencur commented Mar 3, 2022

The failure in reconciler-tests is unrelated. I will re-run the tests later after getting some feedback on this PR.

@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 4, 2022
@knative-prow-robot
Contributor

@mgencur: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-knative-eventing-reconciler-tests 7aae60f link false /test pull-knative-eventing-reconciler-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@pierDipi
Member

pierDipi commented Mar 7, 2022

/cc @cardil

@cardil
Contributor

cardil commented Mar 7, 2022

Thanks, @mgencur, for doing this. I will review it shortly...

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2022
@mgencur
Contributor Author

mgencur commented Mar 21, 2022

@cardil gentle ping. Two weeks have passed...

Contributor

@cardil cardil left a comment

Looks great @mgencur! A lot of ❤️ for doing this.

I found only some minor nits.

tracingEnv corev1.EnvVar
loggingEnv *corev1.EnvVar
TracingCfg string
LoggingCfg string
Contributor

What's the reason for this change?

Is it the usage in config.toml? If so, it could have been used there directly, just by making the field public:

tracingConfig = '{{- .TracingEnv.Value -}}'

Contributor Author

@mgencur mgencur Mar 28, 2022

Yes. It was for that reason. I somehow find it cleaner if the client holds the config instead of an EnvVar. The function that produces the config is called getTracingConfig. It can then create the EnvVar (which is only done in a single place anyway) or it can pass the config to Wathola. This change doesn't bring any complexity. But I don't have a strong opinion about this.

Contributor

I'm not insisting on rolling this back.
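
To make the config.toml discussion above concrete, here is a minimal sketch of rendering the template line from a client that holds the config as a plain string rather than an EnvVar. The `client` type, the `TracingCfg` field as used here, and the `renderConfig` helper are hypothetical stand-ins for illustration; only the template syntax mirrors the line quoted in the thread.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// client is a hypothetical stand-in for the prober client; it holds the
// tracing config as a string instead of a corev1.EnvVar.
type client struct {
	TracingCfg string
}

// configTemplate mirrors the shape of the config.toml line from the thread.
const configTemplate = "tracingConfig = '{{- .TracingCfg -}}'\n"

// renderConfig fills the template with the client's tracing config.
func renderConfig(c client) (string, error) {
	tmpl, err := template.New("config.toml").Parse(configTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, c); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderConfig(client{TracingCfg: `{"backend":"zipkin"}`})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

With a string field the same value can feed both the template and the single place where the EnvVar is built, which is the trade-off described in the reply.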

test/upgrade/prober/wathola/config/tracing.go Outdated Show resolved Hide resolved
Comment on lines 68 to 70
// Give time to send tracing information.
time.Sleep(5 * time.Second)
Contributor

I don't like this static wait, as it may introduce failures related to the grace period later on.

It should be possible to know whether the tracing data has already been sent.

Contributor Author

I don't think there's an easy way to do it, except for querying the Zipkin endpoint to check whether the data was stored there (but I wouldn't really like to do this).
There's an open issue in opencensus: census-instrumentation/opencensus-go#862
Anyway, the Reporter is created with a default batch interval of 1 second. So it should be enough to wait just 1 second, because the data is flushed every second.
Would that be alright? We're already thinking about not shutting down the Sender and instead using a different way to signal it to send the Finished event. Once that is implemented, the 1-second wait at the end should no longer be needed.
And I think waiting 1 second at the end is not too bad; it's a workaround for the missing feature in opencensus.

Contributor

Right. I think Knative should move to OpenTelemetry as soon as possible. Maybe it will give us greater capabilities as well.

Contributor Author

Yeah. There's an open issue for moving to OpenTelemetry. It's been there for a while...

Contributor

@cardil cardil Mar 28, 2022

Contributor Author

I don't know. It's a different API for exporting "metrics" which we don't need now. And it looks like it can only export those metrics via ReadAndExport. Not the traces :-/ But I'm not 100% sure.

@@ -154,14 +183,22 @@ func (h httpSender) Supports(endpoint interface{}) bool {
}

func (h httpSender) SendEvent(ce cloudevents.Event, endpoint interface{}) error {
return h.SendEventWithContext(context.Background(), ce, endpoint)
Contributor

This line creates a new context object for every event sent.

We should use one context object, and it should be supporting signals (knative.dev/pkg/signals).

Contributor Author

I don't think so. Each event requires a new context because the tracing information is stored in it (via the opencensus exporter).
The "main" branch uses context.Background() as well: https://github.com/knative/eventing/blob/main/test/upgrade/prober/wathola/sender/services.go#L162
I could possibly use return h.SendEventWithContext(signals.NewContext(), ce, endpoint), but I am not sure what it would bring. The context we're using here is for setting up information that is sent with the event: the event is sent and we're done with it (we're not waiting for a shutdown signal there). When I look at where signals.NewContext() is used, it's mainly in long-running "main" functions for catching signals. But the Sender itself has its own loop for handling signals in SendContinually.
So my changes do not change the original behaviour (we have been using context.Background()).

Contributor

Okay. We could shift to using signals.NewContext() by removing the manual signal handling in SendContinually and then creating a sub-context for each send. But you got it right: it's not worth it.

The Reporter is created with a default batch interval 1 second. So, it
should be enough to wait just 1 second because the data is flushed every
1 second.
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 28, 2022
@mgencur
Contributor Author

mgencur commented Mar 28, 2022

Rebased, made loggingCfg private and reduced the sleep time for sending the tracing info down to 1 second.

* The ticker runs every 100ms so it could be 1100 ms until the buffer
really flushes.
@mgencur
Contributor Author

mgencur commented Mar 28, 2022

The failures in the E2E tests here are unrelated:

2022-03-28T11:52:59.3323318Z === CONT  TestBrokerNamespaceDefaulting
2022-03-28T11:52:59.3324361Z     broker_defaults_webhook_test.go:171: 
2022-03-28T11:52:59.3325579Z         	Error Trace:	broker_defaults_webhook_test.go:171
2022-03-28T11:52:59.3326313Z         	            				wait.go:220
2022-03-28T11:52:59.3326863Z         	            				wait.go:233
2022-03-28T11:52:59.3327386Z         	            				wait.go:660
2022-03-28T11:52:59.3327918Z         	            				wait.go:594
2022-03-28T11:52:59.3328452Z         	            				wait.go:458
2022-03-28T11:52:59.3328986Z         	            				wait.go:443
2022-03-28T11:52:59.3329824Z         	            				broker_defaults_webhook_test.go:163
2022-03-28T11:52:59.3330224Z         	Error:      	Not equal: 
2022-03-28T11:52:59.3330716Z         	            	expected: "PT0.5S"
2022-03-28T11:52:59.3331192Z         	            	actual  : "PT0.2S"
2022-03-28T11:52:59.3331538Z         	            	
2022-03-28T11:52:59.3331912Z         	            	Diff:
2022-03-28T11:52:59.3332621Z         	            	--- Expected
2022-03-28T11:52:59.3333028Z         	            	+++ Actual
2022-03-28T11:52:59.3333508Z         	            	@@ -1 +1 @@
2022-03-28T11:52:59.3333946Z         	            	-PT0.5S
2022-03-28T11:52:59.3334330Z         	            	+PT0.2S
2022-03-28T11:52:59.3334780Z         	Test:       	TestBrokerNamespaceDefaulting
2022-03-28T11:52:59.3340279Z     broker_defaults_webhook_test.go:172: 
2022-03-28T11:52:59.3340885Z         	Error Trace:	broker_defaults_webhook_test.go:172
2022-03-28T11:52:59.3341477Z         	            				wait.go:220
2022-03-28T11:52:59.3342013Z         	            				wait.go:233
2022-03-28T11:52:59.3342534Z         	            				wait.go:660
2022-03-28T11:52:59.3343065Z         	            				wait.go:594
2022-03-28T11:52:59.3343591Z         	            				wait.go:458
2022-03-28T11:52:59.3344104Z         	            				wait.go:443
2022-03-28T11:52:59.3345119Z         	            				broker_defaults_webhook_test.go:163
2022-03-28T11:52:59.3345514Z         	Error:      	Not equal: 
2022-03-28T11:52:59.3346033Z         	            	expected: 5
2022-03-28T11:52:59.3346436Z         	            	actual  : 10
2022-03-28T11:52:59.3346858Z         	Test:       	TestBrokerNamespaceDefaulting

if err != nil {
Log.Warn("Tracing configuration is invalid, using the no-op default", zap.Error(err))
}
if err = tracing.SetupStaticPublishing(Log, "", config); err != nil {
Contributor

I'm thinking that returning the *tracing.OpenCensusTracer here might allow calling its Finish method later.

func SetupStaticPublishing(logger *zap.SugaredLogger, serviceName string, cfg *config.Config) (*tracing.OpenCensusTracer, error) {
	oct := tracing.NewOpenCensusTracer(tracing.WithExporter(serviceName, logger))
	if err := oct.ApplyConfig(cfg); err != nil {
		return nil, fmt.Errorf("unable to set OpenCensusTracing config: %w", err)
	}
	return oct, nil
}

That method should call the Close() method of the HTTP reporter, which should break the send loop and try to send the last batch before shutdown. I think that should be enough to send all registered spans.

Contributor Author

Thanks. Let me try it.

Contributor Author

So this whole approach doesn't work, because nil is passed to the function on this line, and then this line throws "panic: runtime error: invalid memory address or nil pointer dereference".
It is also not possible to call tracing.SetupTracing() again (to force a call to the reporter's Close method), because the tracer is already registered and cannot be registered again.
As a side note, I've verified that 1.5 seconds is enough for the spans/traces to be sent properly.
I recommend using a 1.5-second sleep and calling it done.

Contributor Author

Contributor

@cardil cardil left a comment

/lgtm

Looks great! Thanks @mgencur for doing this and for knative/pkg#2475 as well!

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Mar 29, 2022
Member

@pierDipi pierDipi left a comment

Very minor and non-blocking comment.

/lgtm
/approve

Comment on lines +571 to 573
if c.loggingCfg != "" {
pod.Containers[i].Env = append(pod.Containers[i].Env, corev1.EnvVar{Name: ti.ConfigLoggingEnv, Value: c.loggingCfg})
}
Member

Perhaps we can refactor logging and tracing in [1] to be consistent when "" is provided because it looks weird that we're handling the empty string only for the logging config.

[1]:

// ConfigureTracing can be used in test-images to configure tracing
func ConfigureTracing(logger *zap.SugaredLogger, serviceName string) error {
	tracingEnv := os.Getenv(ConfigTracingEnv)
	if tracingEnv == "" {
		return tracing.SetupStaticPublishing(logger, serviceName, config.NoopConfig())
	}
	conf, err := config.JSONToTracingConfig(tracingEnv)
	if err != nil {
		return err
	}
	return tracing.SetupStaticPublishing(logger, serviceName, conf)
}

// ConfigureLogging can be used in test-images to configure logging
func ConfigureLogging(ctx context.Context, name string) context.Context {
	loggingEnv := os.Getenv(ConfigLoggingEnv)
	conf, err := logging.JSONToConfig(loggingEnv)
	if err != nil {
		logging.FromContext(ctx).Warn("Error while trying to read the config logging env: ", err)
		return ctx
	}
	l, _ := logging.NewLoggerFromConfig(conf, name)
	return logging.WithLogger(ctx, l)
}

Contributor Author

@pierDipi Did you mean something like this? #6289

Member

I meant the other way around: support an empty loggingEnv in ConfigureLogging so that we can drop the if c.loggingCfg != "" { check in this PR.
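
A minimal sketch of what "supporting an empty loggingEnv" could mean: treat the empty string as "use the defaults" instead of as a parse error, so callers can always set the variable. The `loggingConfig` type, its `level` field, and `parseLoggingEnv` are hypothetical simplifications; the real logging config in knative.dev/pkg has a different shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// loggingConfig is a hypothetical, heavily simplified logging config.
type loggingConfig struct {
	Level string `json:"level"`
}

func defaultLoggingConfig() loggingConfig {
	return loggingConfig{Level: "info"}
}

// parseLoggingEnv treats an empty value as "use the defaults", mirroring how
// ConfigureTracing falls back to a no-op config for an empty tracing value.
func parseLoggingEnv(env string) (loggingConfig, error) {
	cfg := defaultLoggingConfig()
	if env == "" {
		return cfg, nil
	}
	if err := json.Unmarshal([]byte(env), &cfg); err != nil {
		return loggingConfig{}, fmt.Errorf("invalid logging config: %w", err)
	}
	return cfg, nil
}

func main() {
	empty, _ := parseLoggingEnv("")
	debug, _ := parseLoggingEnv(`{"level":"debug"}`)
	fmt.Println(empty.Level, debug.Level)
}
```

With the empty case handled inside the parser, the caller no longer needs its own emptiness check before appending the environment variable.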

@knative-prow

knative-prow bot commented Mar 29, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cardil, mgencur, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 29, 2022
@knative-prow knative-prow bot merged commit 3890b39 into knative:main Mar 29, 2022
mgencur added a commit to mgencur/eventing that referenced this pull request Apr 20, 2022
* wathola exposing trace information

* Run update-deps.sh

* Fix license

* Fix import

* Ensure backwards compatibility

* Assert ParentID not nil in test

* Separate old and new events sender APIs

* Make loggingCfg in client private

* Wait only 1 second for flushing tracing info

The Reporter is created with a default batch interval 1 second. So, it
should be enough to wait just 1 second because the data is flushed every
1 second.

* Increase the sleep time to 1.5 seconds to be safe

* The ticker runs every 100ms so it could be 1100 ms until the buffer
really flushes.

* Use Log.Fatal when tracing is not set up properly

* Increase the sleep time to 5 seconds and reference knative/pkg issue
knative-prow bot pushed a commit that referenced this pull request Apr 20, 2022
* Wathola Tracing for upgrade tests (#6219)

* wathola exposing trace information

* Run update-deps.sh

* Fix license

* Fix import

* Ensure backwards compatibility

* Assert ParentID not nil in test

* Separate old and new events sender APIs

* Make loggingCfg in client private

* Wait only 1 second for flushing tracing info

The Reporter is created with a default batch interval 1 second. So, it
should be enough to wait just 1 second because the data is flushed every
1 second.

* Increase the sleep time to 1.5 seconds to be safe

* The ticker runs every 100ms so it could be 1100 ms until the buffer
really flushes.

* Use Log.Fatal when tracing is not set up properly

* Increase the sleep time to 5 seconds and reference knative/pkg issue

* Process empty tracing config in test images (#6289)

* Print traces for missed events in upgrade tests (#6249)

* Upgrade tests reporting Trace information for missed events

* TMP: Induce missed event

* Revert "TMP: Induce missed event"

This reverts commit 2fec7c7.

* Report trace also for Duplicated events

* TMP: Induce missed event

* TMP: Simulate duplicate events

* Fix readme

* Unify path for duplicate and missed events

* Revert "TMP: Simulate duplicate events"

This reverts commit c126521.

* Revert "TMP: Induce missed event"

This reverts commit fcd9185.

* Do not fail upgrade tests if tracing is not configured (#6299)

* Do not fail upgrade tests if tracing is not configured

* TMP: Do not deploy Knative Monitoring

* Revert "TMP: Do not deploy Knative Monitoring"

This reverts commit 086a8f9.

* Limit the number of exported traces (#6329)

Exporting traces for a large number of events can exceed the timeout of
the whole test suite, leading to all upgrade tests being reported as
failed.

* Cleanup Zipkin tracing only once in upgrade test suite (#6331)

* NPE fix (#6343)

Co-authored-by: Chris Suszynski <[email protected]>
knative-prow-robot pushed a commit to knative-prow-robot/eventing that referenced this pull request Apr 20, 2022
* wathola exposing trace information

matzew pushed a commit to matzew/eventing that referenced this pull request Apr 20, 2022
* Wathola Tracing for upgrade tests (knative#6219)

knative-prow bot pushed a commit that referenced this pull request Apr 20, 2022
* Wathola Tracing for upgrade tests (#6219)

openshift-merge-robot pushed a commit to openshift/knative-eventing that referenced this pull request Apr 21, 2022
* Wathola Tracing for upgrade tests (knative#6219)

openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/knative-eventing that referenced this pull request Apr 21, 2022
* Wathola Tracing for upgrade tests (knative#6219)

openshift-merge-robot pushed a commit to openshift/knative-eventing that referenced this pull request Apr 21, 2022
* Wathola Tracing for upgrade tests (knative#6219)

hawkli-1994 pushed a commit to katanomi/knative-eventing that referenced this pull request Jul 24, 2023
* Wathola Tracing for upgrade tests (knative#6219)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test-and-release Test infrastructure, tests or release lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ApiServerSource should be able to watch resources in other namespaces besides the one in which it resides
4 participants