Versions after v0.19.1 aren't triggering after the PR check. #198

Open
charlieoleary opened this issue Apr 26, 2020 · 4 comments

Comments

@charlieoleary

I'm not sure what happened here, but two days ago our pipelines stopped progressing past our PR phase, which uses this resource. Pinning the resource to v0.19.1 allows the pipelines to work as expected again.

The only errors I've seen are the following from the ATC and worker nodes:

atc
{"timestamp":"2020-04-24T20:00:44.899463521Z","level":"info","source":"atc","message":"atc.failed-to-write-event","data":{"error":"write tcp 172.30.3.3:8080-\u003e172.30.39.48:63164: write: broken pipe","id":5}}

worker
Apr 25 01:17:58 concourse-worker concourse[12351]: {"timestamp":"2020-04-25T01:17:58.319713097Z","level":"error","source":"guardian","message":"guardian.api.garden-server.get-property.failed","data":{"error":"property does not exist: concourse:exit-status","handle":"43f16f9b-d4b9-48b1-5c40-37db687970b8","session":"3.1.6448"}}

This is using a binary install of Concourse v6.0.0 on Ubuntu 18.04. Everything breaking at the same time as the new version was released seems too much of a coincidence, and, as mentioned, pinning the resource to v0.19.1 allows everything to work normally.
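
For anyone comparing the two entries above, their severity and the gap between them can be pulled out with a short script. This is only a minimal sketch: it assumes each entry is a JSON log object, optionally preceded by a syslog prefix, and the helper names `parse_entry` and `parse_time` are hypothetical, not part of Concourse.

```python
import json
from datetime import datetime, timedelta

# The two log entries quoted above, copied verbatim (raw strings keep the \u003e escape intact).
ATC_LINE = r'{"timestamp":"2020-04-24T20:00:44.899463521Z","level":"info","source":"atc","message":"atc.failed-to-write-event","data":{"error":"write tcp 172.30.3.3:8080-\u003e172.30.39.48:63164: write: broken pipe","id":5}}'
WORKER_LINE = r'Apr 25 01:17:58 concourse-worker concourse[12351]: {"timestamp":"2020-04-25T01:17:58.319713097Z","level":"error","source":"guardian","message":"guardian.api.garden-server.get-property.failed","data":{"error":"property does not exist: concourse:exit-status","handle":"43f16f9b-d4b9-48b1-5c40-37db687970b8","session":"3.1.6448"}}'


def parse_entry(line: str) -> dict:
    """Hypothetical helper: drop any syslog prefix and decode the JSON payload."""
    return json.loads(line[line.index("{"):])


def parse_time(entry: dict) -> datetime:
    """Hypothetical helper: parse the timestamp, truncated to whole seconds."""
    return datetime.strptime(entry["timestamp"][:19], "%Y-%m-%dT%H:%M:%S")


atc = parse_entry(ATC_LINE)
worker = parse_entry(WORKER_LINE)
gap = parse_time(worker) - parse_time(atc)

for entry in (atc, worker):
    print(entry["level"], entry["source"], entry["message"], entry["data"]["error"])
print("time between entries:", gap)
```

Only the worker entry is logged at the `error` level; the ATC entry is `info`.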

@itsdalmo
Contributor

Hi @charlieoleary! Unfortunately, I'm not sure what to make of your error messages: the entries are hours apart, and only one of them is an actual error, which is innocuous and gets logged all the time. Additionally, the resource seems to work fine in the Concourse unit tests, which also run on Concourse 6.0.

In other words, unless we have more to go on, I doubt the issues you are seeing are caused by the changes to this resource 🤷‍♂️ Perhaps cycling your workers or recreating the pipeline will fix your issue?

@charlieoleary
Author

Hey @itsdalmo, the log entries were from different hours because I discovered the worker error later on in my investigation. Curiously, I actually rebuilt the entire cluster including workers and ATC. I even went through the trouble of completely resetting the database and starting fresh with our pipeline definitions. Each time, using the current version, no pipeline would proceed past the PR step. After pinning back to 0.19.1, everything works as expected. Unpinning again results in broken pipelines.

Nothing else has changed other than the version of this resource. After pinning to 0.19.1 I no longer see any errors in the ATC or worker (beyond some throttling limits on SSM anyway, but that’s unrelated).

@jhosteny
Contributor

jhosteny commented Apr 29, 2020

To add some information: I believed I was seeing this too on an existing repo. I created a local version of the resource, started from 0.19.1, and cherry-picked each change up to the head of master. At each change, I added a PR on a new test repo that used the corresponding version of my local build of the resource. I was unable to replicate the problem. Note, however, that the commits and PRs were in the same order. That is, I was avoiding the cause of #26.

When I applied the latest version of this resource to an existing pipeline, I was unable to get an older PR to trigger. I suspect this was due to #26 though. I went back to the officially published resource, then cherry-picked #189 (as well as my latest submodule PR, #200, but that is fairly specific to our use case). That fixed the issue for me.

@itsdalmo
Contributor

Thanks for looking into this @jhosteny! Personally I've not had any problems with my own pipelines after the latest release, and I would expect more traffic in this issue if it was a widespread problem. What is the status for you @charlieoleary?

3 participants