-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ubuntu: Bump to latest LTS 24.04 #281
Conversation
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
eeac455
to
9226802
Compare
Do you have output for the test failure? |
Yes, it can be found here https://github.com/cilium/cilium/actions/runs/9589413585 |
9226802
to
a946c12
Compare
What is the ubuntu base image used for? It's pretty bad if the verifier test fails with a complexity error, so I'm hesitant to just keep the old version to "fix" it. cc @ti-mo seems like maybe we're using a different clang now? i was under the impression that we bake in our own clang? |
a946c12
to
a843c76
Compare
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Signed-off-by: Tam Mach <[email protected]>
a843c76
to
7a6e713
Compare
This reverts commit 11cfa1a to avoid the below verifier issue in cilium/cilium upstream ``` --- FAIL: TestVerifier/bpf_host (111.35s) --- FAIL: TestVerifier/bpf_host/1 (24.05s) --- FAIL: TestVerifier/bpf_host/2 (21.32s) ... ``` Sample error log for TestVerifier/bpf_host/1 ``` verifier_test.go:244: Error: program tail_nodeport_nat_ingress_ipv4: load program: argument list too long: BPF program is too large. Processed 1000001 insn (1445 line(s) omitted) Verifier error tail: load program: argument list too long: (1436 line(s) omitted) ; struct ipv4_ct_tuple icmp_tuple = { 1201: (63) *(u32 *)(r10 -12) = r1 ; .flags = tuple->flags | TUPLE_F_RELATED, 1202: (71) r1 = *(u8 *)(r10 -123) ; .flags = tuple->flags | TUPLE_F_RELATED, 1203: (44) w1 |= 2 ; struct ipv4_ct_tuple icmp_tuple = { 1204: (73) *(u8 *)(r10 -3) = r1 BPF program is too large. Processed 1000001 insn processed 1000001 insns (limit 1000000) max_states_per_insn 17 total_states 55755 peak_states 1015 mark_read 59 ``` Sample error log for TestVerifier/bpf_host/2 ``` verifier_test.go:244: Error: program tail_nodeport_nat_ingress_ipv4: load program: argument list too long: BPF program is too large. Processed 1000001 insn (1445 line(s) omitted) Verifier error tail: load program: argument list too long: (1436 line(s) omitted) ; struct ipv4_ct_tuple icmp_tuple = { 1201: (63) *(u32 *)(r10 -12) = r1 ; .flags = tuple->flags | TUPLE_F_RELATED, 1202: (71) r1 = *(u8 *)(r10 -123) ; .flags = tuple->flags | TUPLE_F_RELATED, 1203: (44) w1 |= 2 ; struct ipv4_ct_tuple icmp_tuple = { 1204: (73) *(u8 *)(r10 -3) = r1 BPF program is too large. Processed 1000001 insn processed 1000001 insns (limit 1000000) max_states_per_insn 17 total_states 55755 peak_states 1015 mark_read 59 ```
7a6e713
to
ab1f370
Compare
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
The failure is actually due to #279. I think the best way is to revert llvm version bump, and proceed with ubuntu upgrade. |
That sucks, I don't like the idea of blindly revering the compiler version, since quite a lot of time has gone into that upgrade. I would like to involve @gentoo-root first before we do that revert. |
Yeah, after I found out about clang version, knowing Max's good work, I would have expected that Max was trying to upgrade in cilium/cilium as per below (which is having same failures as my testing). IMO, it's pretty safe to revert LLVM 1.18 commit, and tackle the verifer issue later. |
Since a previous LLVM version seems to work, I'm not sure about this report: it suggests that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected that Max was trying to upgrade in cilium/cilium as per below (which is having same failures as my testing). IMO, it's pretty safe to revert LLVM 1.18 commit, and tackle the verifer issue later.
Ah, I was of the believe that we already were on Clang 18 on cilium/cilium, if thats not the case than we can indeed revert to 17 here, for now. Sorry for the noise
We actually have the error log >1mil statements. Additionally, net-next is actually passed, only 5.10, 5.15 and 6.1 failed. I did a little bit investigation, it seems to be related to ENABLE_IPV6 flag in cilium bpf code, which triggers some complexity issue. Full error log can be found in the below GHA. |
I looked for similar issues in the tracker and this is what I found:
Interestingly,
|
I created https://github.com/cilium/image-tools/tree/ubuntu-22.04 to start the fork from before we merge this, so future fixes for Cilium v1.16 or older can go into that branch. From v1.17 onwards we can start adopting this. |
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Thanks for your input, I would defer this investigation to datapath experts (e.g. @gentoo-root or other datapath team member), as it might take more time for newbie like myself 😅. PS: Kudos to LVH team with awesome steps, local feedback loop is superb. |
Relates: cilium/image-tools#281 Signed-off-by: Tam Mach <[email protected]>
Testing was done in cilium/cilium#33264