Skip to content

fix(tests): Re-enable MQTT and Postgres tests #3163

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

PhantomInTheWire
Copy link

@PhantomInTheWire PhantomInTheWire commented Jun 17, 2025

This PR re-enables the previously ignored outbound-postgres and outbound-postgres-variable-permission tests. It also fixes their flakiness.
fixes: #2907
high-level changes:

  • Re-enabled previously ignored flaky tests
  • Added PostgreSQL service to CI
  • Used fixed port 5432 and removed %{} syntax from spin.toml
  • Updated flaky time/datetime assertions

@PhantomInTheWire PhantomInTheWire marked this pull request as draft June 17, 2025 21:53
@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch from b7c00c1 to 3913626 Compare June 18, 2025 11:28
@itowlson
Copy link
Collaborator

We had a pass - I kicked off a re-run to see if we can get a flake

@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch 2 times, most recently from 9031f2a to fa3e779 Compare June 18, 2025 22:41
@PhantomInTheWire
Copy link
Author

PhantomInTheWire commented Jun 18, 2025

@itowlson I've squashed all necessary changes. For context, all previous runs of all-integration-tests were executed on ubuntu-latest, which allowed them to run on my fork's PRs. However, I've now switched it back to the custom Spin runner.

@PhantomInTheWire PhantomInTheWire marked this pull request as ready for review June 18, 2025 22:45
@itowlson
Copy link
Collaborator

Excellent, thanks! Sorry for the slow turmarounds here, it's making me manually approve each re-run, but I will try to stay across it.

@PhantomInTheWire
Copy link
Author

Everything is in place and all the tests pass, waiting for your review @itowlson

@itowlson
Copy link
Collaborator

Thank you @PhantomInTheWire! I'm planning to re-run the CI a few times to see if the flakes re-appear, so please don't worry if things go back to 'running.' I'll take a look at the changes in the meantime.

@itowlson itowlson requested a review from rylev June 19, 2025 00:41
Copy link
Collaborator

@itowlson itowlson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for looking into this! It all looks good to me but there's a couple of things I'd like to undestand, and I really want to get feedback from @rylev as well. But fingers crossed that you've cracked it...!

@@ -14,5 +14,5 @@ component = "test"

[component.test]
source = "%{source=outbound-postgres}"
allowed_outbound_hosts = ["postgres://{{ pg_host }}:%{port=5432}"]
environment = { DB_URL = "postgres://postgres:postgres@localhost:%{port=5432}/spin_dev" }
allowed_outbound_hosts = ["postgres://{{ pg_host }}:5432"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the changes away from placeholder ports needed? I believe we did this so that test instances could be brought up on ports determined at runtime, and it would be good to keep it. If it's not working, it would be good to understand why. (But if that's involved in the flakiness, we can probably punt on it for now.) cc @rylev for thoughts and insight

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leaving the placeholder ports in place causes a runtime failure. Specifically, the test fails with:

postgres::Connection::open(&address) errored: 'Error::ConnectionFailed("error communicating with the server: Connection reset by peer (os error 54)")'

This happens sometimes unless a fixed port is provided, the dynamic port allocation logic isn’t working reliably in this test/ci context. We could dig deeper into why, but for now, hardcoding the port makes the test pass deterministically.

@@ -191,6 +191,20 @@ jobs:
# run on a larger runner for more SSD/resource access
runs-on: ubuntu-22.04-4core-spin
if: ${{ github.repository_owner == 'spinframework' }}
services:
postgres:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I thought we were dockerising the Postgres instance, but maybe that's been part of the problem. Again, I'd like to get Ryan's input on making it part of the CI image instead (I'm not for or against it, I just want to hear from someone who knows this bit of the system better than I do!).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think it was not dockerising it right in ci environments, i think it's the same problem with mqtt too, hence the hardcoded service

@itowlson
Copy link
Collaborator

Bother, that's a flake in MQTT. I saw you updated the settings for Postgres; do you think something similar needs to be done for MQTT?

@PhantomInTheWire
Copy link
Author

PhantomInTheWire commented Jun 19, 2025

Bother, that's a flake in MQTT. I saw you updated the settings for Postgres; do you think something similar needs to be done for MQTT?

yeah this is my first time encountering it I think adding a mqtt service to build.yml might fix it but let me try and see if i can figure it out

@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch from f1d22de to a1266b6 Compare June 19, 2025 12:11
@PhantomInTheWire
Copy link
Author

PhantomInTheWire commented Jun 19, 2025

Update: I attempted to reproduce the flake on a fork using the ubuntu-latest runner (link), running 20 jobs concurrently across multiple attempts but couldn’t trigger the issue(it did fail once on run # 2 but that was integration_tests::bad_build_test not mqtt).

Could you share how the custom Spin runner is configured? I think that could be a reason

@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch 3 times, most recently from fa3e779 to 01fc86f Compare June 19, 2025 16:12
@itowlson
Copy link
Collaborator

Hmm, I don't think I have access to the custom large runner configuration. @spinframework/org-admin can one of you folks help out here?

@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch 4 times, most recently from b0278f8 to e9a0850 Compare June 23, 2025 12:51
@PhantomInTheWire PhantomInTheWire force-pushed the fix/postgres-mqtt-tests branch from e9a0850 to 878fa5e Compare June 23, 2025 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Re-enable MQTT and Postgres tests
2 participants