[rhcos-4.14] Forward changes from main branch to 4.14 RHCOS before branching #3564

Merged: 10 commits merged on Aug 17, 2023

travier
Member

@travier travier commented Aug 17, 2023

404fe5a mantle/kola: workaround checkService race condition on systemd 254+
829271a mantle: fix offline detection in testiso tests
a4f25d5 koji: Add functions to check/ensure the build tag
5325854 docs: Update the documentation about kola testiso
f38e1df mantle/kola: simplify logic in ParseDenyListYaml
4840baa kola: support denylist Warn feature
74c6108 gcloud: Enable SEV_SNP_CAPABLE
222404d cmd-generate-release-meta: add hyperv to platform list
698a1ea s390x: add kargs embed area
04ae58a cmd-build: Conditionally change the packing structure of container-image. When the previous build exists, use its packing structure; otherwise container-encapsulate generates a new one.

RishabhSaini and others added 10 commits July 18, 2023 11:24
When the previous build exists, use its packing structure; otherwise
container-encapsulate generates a new one.
coreos#3344

We bubble up denylisted tests that have the 'Warn: true' option as warnings rather than hard failures:
```
kola -p qemu run --parallel 8 ext.config.ntp.* --output-dir tmp/kola

⏭️  Skipping kola test pattern "ext.config.ntp.chrony.dhcp-propagation":
  👉 coreos/fedora-coreos-tracker#1508
⚠️  Warning kola test pattern "ext.config.ntp.timesyncd.dhcp-propagation", snoozing expired on Jul 20 2023:
  👉 coreos/fedora-coreos-tracker#1508
=== RUN   ext.config.ntp.chrony.coreos-platform-chrony-config
=== RUN   ext.config.ntp.timesyncd.dhcp-propagation
--- PASS: ext.config.ntp.chrony.coreos-platform-chrony-config (27.56s)
--- WARN: ext.config.ntp.timesyncd.dhcp-propagation (90.72s)
        cluster.go:162: Error: Unit kola-runext.service exited with code 1
        cluster.go:162: 2023-07-27T06:24:18Z cli: Unit kola-runext.service exited with code 1
        harness.go:1236: kolet failed: : kolet run-test-unit failed: Process exited with status 1
FAIL, output in tmp/kola
+ rc=0
+ set +x
```

Co-authored-by: Dusty Mabe <[email protected]>

With the introduction of `warn: true`, the denylist parsing logic got complicated.
This is an attempt to simplify the logic and handle all corner cases.
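
As a rough illustration of the corner cases such logic has to cover, here is a minimal sketch in Python (kola itself is not written in Python). The `DenyEntry` fields, the `classify` helper, and the exact rules are assumptions made for this sketch, loosely based on the log output above. They are not taken from `ParseDenyListYaml`.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DenyEntry:
    # Hypothetical field names, chosen for this sketch only.
    pattern: str
    snooze: Optional[date] = None  # last day on which the test is snoozed
    warn: bool = False             # demote failures to warnings


def classify(entry: DenyEntry, today: date) -> str:
    """Return "skip", "warn", or "run" for one denylist entry.

    Assumed rules (not confirmed by the source):
      - snooze still active          -> skip the test
      - snooze expired, warn set     -> run, report failure as a warning
      - snooze expired, warn not set -> run as a normal test
      - no snooze, warn set          -> run, report failure as a warning
      - no snooze, warn not set      -> skip the test
    """
    if entry.snooze is not None and today <= entry.snooze:
        return "skip"
    if entry.warn:
        return "warn"
    if entry.snooze is not None:
        return "run"
    return "skip"
```

With these assumed rules, an entry whose snooze expired on Jul 20 2023 and that sets `warn` is classified as "warn" on Jul 27 2023, which matches the `--- WARN` line in the example output.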

 - Add check_tag, to check whether the tag was added to the build;
 - Add ensure_tag, to add the tag if it is not already part of the build.

Signed-off-by: Renata Ravanelli <[email protected]>
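
The check/ensure pattern described above could look roughly like the following sketch. `FakeSession` is a hypothetical in-memory stand-in so the example runs without a Koji hub. Its `listTags` and `tagBuild` methods are named after the Koji hub calls, but the signatures are simplified, and whether the commit uses these calls is an assumption.

```python
class FakeSession:
    """In-memory stand-in for a Koji session, for illustration only."""

    def __init__(self):
        self._tags = {}  # build NVR -> set of tag names

    def listTags(self, build):
        # Return one dict per tag, each with a "name" key.
        return [{"name": t} for t in sorted(self._tags.get(build, set()))]

    def tagBuild(self, tag, build):
        self._tags.setdefault(build, set()).add(tag)


def check_tag(session, build, tag):
    """Return True if `tag` is already applied to `build`."""
    return any(t["name"] == tag for t in session.listTags(build))


def ensure_tag(session, build, tag):
    """Apply `tag` to `build` only if it is missing.

    Returns True if the tag had to be added, False if it was already there.
    """
    if check_tag(session, build, tag):
        return False
    session.tagBuild(tag, build)
    return True
```

Calling `ensure_tag` twice with the same arguments is safe: the second call finds the tag and does nothing.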

The offline detection in our testiso tests is not working properly,
meaning we are never running offline tests today. Let's fix that
here by parsing it properly.

Fixes f98481f.

After rawhide was updated to systemd 254+, we started seeing failures on
ppc64le/s390x where checking the ignition-ostree-growfs.service (part
of the `basic/FCOSGrowpart` test) would fail because multiple entries
(lines) would be returned by the
`journalctl -o json MESSAGE_ID=39f53479d3a045ac8e11786248231fbf UNIT=ignition-ostree-growfs.service`
call:

```
 --- FAIL: basic (41.40s)
         cluster.go:94: kolet:
 2023-08-16T03:42:44Z kolet: Error getting journalclt output for ignition-ostree-growfs.service: invalid character '{' after top-level value. Out: {"JOB_ID":"68","SYSLOG_IDENTIFIER":"systemd","PRIORITY":"6","__SEQNUM":"384","__SEQNUM_ID":"18eec6fe008c40a9a6cffc2e9397e0b6","CODE_LINE":"796","_TRANSPORT":"journal","TID":"1","__REALTIME_TIMESTAMP":"1692157339095636","_SELINUX_CONTEXT":"kernel","SYSLOG_FACILITY":"3","_SYSTEMD_CGROUP":"/init.scope","_GID":"0","__CURSOR":"s=18eec6fe008c40a9a6cffc2e9397e0b6;i=180;b=4e0b76a0cfa44247ba229413f78e95f0;m=be48b8;t=603021519d254;x=363c4c17ec8cd6cc","_MACHINE_ID":"e5611e01502c4799bde984942694b30c","MESSAGE_ID":"39f53479d3a045ac8e11786248231fbf","_PID":"1","CODE_FILE":"src/core/job.c","_SYSTEMD_UNIT":"init.scope","_SOURCE_REALTIME_TIMESTAMP":"1692157339095627","_EXE":"/usr/lib/systemd/systemd","_CAP_EFFECTIVE":"1ffffffffff","JOB_TYPE":"start","_BOOT_ID":"4e0b76a0cfa44247ba229413f78e95f0","UNIT":"ignition-ostree-growfs.service","_COMM":"systemd","_HOSTNAME":"localhost","INVOCATION_ID":"b49823ed3a4948048ae303deb5351dd1","JOB_RESULT":"done","_SYSTEMD_SLICE":"-.slice","CODE_FUNC":"job_emit_done_message","_CMDLINE":"/init","__MONOTONIC_TIMESTAMP":"12470456","_RUNTIME_SCOPE":"initrd","_UID":"0","MESSAGE":"Finished ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem."}
 {"_SYSTEMD_CGROUP":"/init.scope","SYSLOG_FACILITY":"3","_SYSTEMD_UNIT":"init.scope","CODE_LINE":"796","_EXE":"/usr/lib/systemd/systemd","MESSAGE":"Finished ignition-ostree-growfs.service - Ignition OSTree: Grow Root Filesystem.","UNIT":"ignition-ostree-growfs.service","JOB_ID":"68","_SOURCE_REALTIME_TIMESTAMP":"1692157339095627","_SYSTEMD_SLICE":"-.slice","_GID":"0","_CAP_EFFECTIVE":"1ffffffffff","SYSLOG_IDENTIFIER":"systemd","CODE_FILE":"src/core/job.c","CODE_FUNC":"job_emit_done_message","_COMM":"systemd","_RUNTIME_SCOPE":"initrd","PRIORITY":"6","JOB_RESULT":"done","__SEQNUM":"1341","_TRANSPORT":"journal","JOB_TYPE":"start","INVOCATION_ID":"b49823ed3a4948048ae303deb5351dd1","_UID":"0","__CURSOR":"s=18eec6fe008c40a9a6cffc2e9397e0b6;i=53d;b=4e0b76a0cfa44247ba229413f78e95f0;m=be48b8;t=603021519d254;x=363c4c17ec8cd6cc","_BOOT_ID":"4e0b76a0cfa44247ba229413f78e95f0","_CMDLINE":"/init","__SEQNUM_ID":"18eec6fe008c40a9a6cffc2e9397e0b6","_HOSTNAME":"localhost","_PID":"1","__REALTIME_TIMESTAMP":"1692157339095636","__MONOTONIC_TIMESTAMP":"12470456","TID":"1","_SELINUX_CONTEXT":"kernel","MESSAGE_ID":"39f53479d3a045ac8e11786248231fbf","_MACHINE_ID":"e5611e01502c4799bde984942694b30c"}
     --- FAIL: basic/FCOSGrowpart (0.18s)
```

Apparently it was picking up an entry from /run/log/journal and another
one from /var/log/journal. The entries are almost completely identical,
including __SEQNUM_ID, _SOURCE_REALTIME_TIMESTAMP, etc. Here is one example
diff:

```
[core@cosa-devsh ~]$ diff -ur foo1.json foo2.json
--- foo1.json   2023-08-16 13:51:18.406733458 +0000
+++ foo2.json   2023-08-16 13:51:10.336733458 +0000
@@ -30,9 +30,9 @@
   "_SYSTEMD_UNIT": "init.scope",
   "_TRANSPORT": "journal",
   "_UID": "0",
-  "__CURSOR": "s=a28672494c644ba48c9b909e39681e91;i=17d;b=750a68e68fa045dd884bb1b5a1e40b50;m=72eaf9;t=6030a8488adf8;x=e7da920318b458f1",
+  "__CURSOR": "s=a28672494c644ba48c9b909e39681e91;i=551;b=750a68e68fa045dd884bb1b5a1e40b50;m=72eaf9;t=6030a8488adf8;x=e7da920318b458f1",
   "__MONOTONIC_TIMESTAMP": "7531257",
   "__REALTIME_TIMESTAMP": "1692193568370168",
-  "__SEQNUM": "381",
+  "__SEQNUM": "1361",
   "__SEQNUM_ID": "a28672494c644ba48c9b909e39681e91",
 }
```

It shows that only the __SEQNUM and __CURSOR fields are different.

Suffice it to say that these messages are from the same service run and that
the service didn't run twice.
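
To make the failure mode concrete, here is a small Python sketch (kolet itself is not Python). It shows that two JSON objects on separate lines cannot be parsed as a single document, and that the two entries collapse into one once `__SEQNUM` and `__CURSOR` are ignored. The helper names are hypothetical, the sample entries are heavily shortened versions of the ones in the diff, and this is not the workaround the commit applies.

```python
import json

# Fields that differed between the two copies in the diff above.
VOLATILE_FIELDS = {"__CURSOR", "__SEQNUM"}


def parse_entries(out):
    """Parse `journalctl -o json` style output: one JSON object per line."""
    return [json.loads(line) for line in out.splitlines() if line.strip()]


def distinct_entries(entries):
    """Drop entries that are identical apart from the volatile fields."""
    seen = set()
    result = []
    for entry in entries:
        stable = {k: v for k, v in entry.items() if k not in VOLATILE_FIELDS}
        key = json.dumps(stable, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(entry)
    return result


# Shortened sample: same unit and result, different __SEQNUM and __CURSOR.
sample = (
    '{"UNIT": "ignition-ostree-growfs.service", "JOB_RESULT": "done",'
    ' "__SEQNUM": "381", "__CURSOR": "i=17d"}\n'
    '{"UNIT": "ignition-ostree-growfs.service", "JOB_RESULT": "done",'
    ' "__SEQNUM": "1361", "__CURSOR": "i=551"}\n'
)

entries = parse_entries(sample)
unique = distinct_entries(entries)
```

Passing `sample` directly to `json.loads` raises `json.JSONDecodeError`, which is the Python counterpart of the "invalid character '{' after top-level value" error in the log above.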

Let's work around this by only considering the on-disk journal, which would
also have been flushed to in early boot, so it should have all the logs we need.
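
One way to restrict a query to the on-disk journal is journalctl's `--directory` option, which reads journal files from the given directory instead of the default runtime and system paths. The sketch below only builds the argument list. The `journalctl_args` helper is hypothetical, and whether the commit uses this exact option is an assumption.

```python
def journalctl_args(unit, message_id, journal_dir="/var/log/journal"):
    """Build a journalctl command line limited to one journal directory."""
    return [
        "journalctl",
        "--directory=" + journal_dir,  # only read the on-disk journal
        "-o", "json",
        "MESSAGE_ID=" + message_id,
        "UNIT=" + unit,
    ]


args = journalctl_args(
    "ignition-ostree-growfs.service",
    "39f53479d3a045ac8e11786248231fbf",
)
```

The resulting list can be passed to `subprocess.run` on a host that has a persistent journal.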

@dustymabe dustymabe changed the title from "Forward changes from main branch to 4.14 RHCOS before branching" to "[rhcos-4.14] Forward changes from main branch to 4.14 RHCOS before branching" on Aug 17, 2023
@travier
Member Author

travier commented Aug 17, 2023

Merging this one now. We'll make a new one if we need more backports.

@travier travier merged commit 9b25d97 into coreos:rhcos-4.14 Aug 17, 2023
5 checks passed
@travier travier deleted the rhcos-4.14-branching-bump branch August 17, 2023 17:06
7 participants