Delay after bloating test image #6014

coryrc · 2019-11-12T19:57:41Z

Some systems don't prevent memory allocation beyond the limit, but instead kill
any pod that exceeds its memory limit after some time. So add a 1-second delay
after bloating memory in the autoscale test image.

Fixes #6007

The delay is placed at this location so that the increase in memory is still logged when possible.

knative-prow-robot

@coryrc: 0 warnings.

knative-test-reporter-robot · 2019-11-12T20:03:42Z

The following jobs failed:

Test name	Triggers	Retries
pull-knative-serving-unit-tests		0/3

Failed non-flaky tests preventing automatic retry of pull-knative-serving-unit-tests:

pkg/activator/net.TestThrottlerWithError
pkg/activator/net.TestThrottlerWithError/both_requests_time_out

coryrc · 2019-11-13T20:41:48Z

/retest

coryrc · 2019-11-13T22:12:46Z

/assign @vagababov

vagababov · 2019-11-13T22:20:37Z

test/test_images/autoscale/autoscale.go

@@ -173,6 +173,7 @@ func handler(w http.ResponseWriter, r *http.Request) {
 		go func() {
 			defer wg.Done()
 			fmt.Fprint(w, bloat(mb))
+			time.Sleep(time.Second)


vagababov: What if we already have a delay of 1s or more?

coryrc: What do you mean, the sleep option? Each operation (bloat, sleep, prime, etc.) runs in parallel, so the added delay has no unanticipated effect unless a bloat and a sleep shorter than 1s occur in the same call and the test expects the response right away. That does not appear to be the case, because everything passes.

vagababov: I guess. Do you think 1s is enough for things to be killed?

coryrc: It is for fully-managed Cloud Run. It has no effect on k8s-based platforms.

vagababov · 2019-11-13T23:29:49Z

/lgtm
/approve
/hold
for the question

knative-prow-robot · 2019-11-13T23:30:00Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: coryrc, vagababov

Needs approval from an approver in each of these files:

~~test/OWNERS~~ [vagababov]

coryrc · 2019-11-14T00:27:45Z

/assign @dgerd

dgerd · 2019-11-14T04:02:47Z

I really don't like the idea of inserting a random sleep to get this to work. In fact I don't like the timing aspect of the test at all, and that the only way to observe failures without timing is to reach down into the Pod. I see two options:

1. Move the test from conformance to e2e -- We have some coverage of resource.limits through the cgroup runtime test, and we don't have anything in our specification on how these limits are enforced.
   In our API specification we link out to K8s, which says: "If a Container exceeds its memory limit, it might be terminated. If it is restartable, the kubelet will restart it, as with any other type of runtime failure." I believe any container runtime that is cgroups-based is going to see a restart, but given that K8s takes such a light stance here with "might be terminated", I could see moving this to e2e to keep coverage and detect regressions, but removing it from conformance.
2. Update our specification -- Update our runtime contract to add more details on how memory limits should be enforced. If we want to go this route, we will want to take a closer look at how various container runtimes enforce this. Can you ever get more than the limit? How long can it be over the limit?

I don't think it is worth the effort to go down the second path at this time.
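
As a rough illustration of the kind of check a cgroup-based test can make without relying on timing, the sketch below parses the contents of a memory limit file. It assumes the cgroup v2 `memory.max` format (a byte count, or the literal `max` when no limit is set); `parseMemoryLimit` is a hypothetical helper and is not taken from the Knative test suite.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryLimit (hypothetical) interprets the contents of a cgroup v2
// memory.max file. It returns the limit in bytes and whether a limit is set.
func parseMemoryLimit(raw string) (int64, bool, error) {
	s := strings.TrimSpace(raw)
	if s == "max" {
		// "max" means the cgroup has no memory limit.
		return 0, false, nil
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, false, fmt.Errorf("parse memory limit %q: %w", s, err)
	}
	return n, true, nil
}

func main() {
	// Sample file contents; a real check would read the file from the
	// container's cgroup filesystem instead.
	for _, raw := range []string{"134217728\n", "max\n"} {
		limit, limited, err := parseMemoryLimit(raw)
		fmt.Println(limit, limited, err)
	}
}
```

A check like this confirms that the configured limit reached the container, but it says nothing about when or how the runtime enforces it, which is the gap described in the comment above.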

coryrc · 2019-11-14T22:26:51Z

Going to go with Dan's request #1 and move it to e2e using this issue: #6006

Delay after bloating test image

bd3da86

Some systems don't prevent memory allocation, but kill any pods that exceed memory usage after some time. So add a 1-second delay after bloating memory in the autoscale test image.

googlebot added the "cla: yes" label (indicates the PR's author has signed the CLA) Nov 12, 2019

knative-prow-robot added the "size/XS" label (denotes a PR that changes 0-9 lines, ignoring generated files) Nov 12, 2019

knative-prow-robot reviewed Nov 12, 2019

knative-prow-robot requested review from tcnghia and vagababov November 12, 2019 19:57

knative-prow-robot added the "area/test-and-release" label (flags unit/e2e/conformance/perf test issues for product features) Nov 12, 2019

knative-prow-robot assigned vagababov Nov 13, 2019

vagababov reviewed Nov 13, 2019

knative-prow-robot added the "do-not-merge/hold" (indicates that a PR should not merge because someone has issued a /hold command) and "lgtm" (indicates that a PR is ready to be merged) labels Nov 13, 2019

knative-prow-robot added the "approved" label (indicates a PR has been approved by an approver from all required OWNERS files) Nov 13, 2019

knative-prow-robot assigned dgerd Nov 14, 2019

coryrc mentioned this pull request Nov 14, 2019

Tighter Acceptance Criteria in TestCustomResources #6006

Closed

coryrc closed this Nov 14, 2019

coryrc deleted the issue6007 branch November 14, 2019 22:27
