By-length slicing with sentinel in first slice doesn't do a by-length slice #22371

dweiller · 2024-12-31T07:30:01Z

Zig Version

0.14.0-dev.2571+01081cc8e

Steps to Reproduce and Observed Behavior

Put this snippet in by-length.zig

test {
    var foo: [2:0] = .{ 1, 2 };
    _ = foo[0.. :0][0..2];
}

and (with a debug build of the compiler) run

zig ast-check -t by-length.zig` | grep "slice_"

to see that the above zig code is producing slice_sentinel and slice_end ZIR nodes rather than a slice_length ZIR node.

Expected Behavior

I believe the above pattern should produce a slice_length ZIR node. This behaviour was changed in #22344 which disabled the by-length slice optimisation when performing a slice such as a[0.. :sentinel][0..len]. From the description of the PR it's unclear whether this was intended or not as the mentioned motivation for the change was to prevent allowing nonsense like

test {
    var foo: [2:0] = .{ 1, 2 };
    _ = foo[0.. :42][0..2];
}

Unfortunately I didn't document in the original by-length slice PR the reason why a sentinel in the first slice above was previously ignored, but if I remember correctly it was because it doesn't have any bearing on the result or type of the whole slicing expression, and I suppose I didn't consider that the above snippet would have needed an extra sentinel check.

If we want the allow

test {
    var foo: [2:0] = .{ 1, 2 };
    _ = foo[0.. :0][0..2];
}

To benefit from by-length slicing then we will need to rework the AstGen fix from #22344 to produce ZIR like it did previously and rather add extra sentinel checks in Sema when either when by_length is true in analyzeSlice or in zirSliceLength. As a side note it may be worth seeing if there is way to to simplify analyzeSlice as I think it's getting a bit hard to follow (especially with the mentioned sentinel checks added).

If the plan is to disable by-length slicing in the above example then it may be worth mentioning this edge case in the language reference and 0.14 release notes, but I believe this would be an unnecessary edge case in the language.

cc @Techatrix

The text was updated successfully, but these errors were encountered:

Techatrix · 2024-12-31T08:12:18Z

I was aware that it would no longer emit .slice_length for a[0.. :sentinel][0..len]. I did not think much of it since I doubt that code like this gets written outside of compiler tests.

To verify my previous statement, I modified AstGen to panic when encountering a[0.. :sentinel][0..len] and found exactly zero occurrences of this pattern in the source code of Zig, Bun, Mach, ZLS, Tigerbeetle and Ghostty. The exception being test/behavior/slice.zig. @dweiller are you aware of any usage of this pattern?

Patch to detect a[0.. :sentinel][0..len]

diff --git a/lib/std/zig/AstGen.zig b/lib/std/zig/AstGen.zig
index 6e424b597f..f396efcde3 100644
--- a/lib/std/zig/AstGen.zig
+++ b/lib/std/zig/AstGen.zig
@@ -882,10 +882,17 @@ fn expr(gz: *GenZir, scope: *Scope, ri: ResultInfo, node: Ast.Node.Index) InnerE
         => {
             const full = tree.fullSlice(node).?;
             if (full.ast.end != 0 and
-                node_tags[full.ast.sliced] == .slice_open and
+                (node_tags[full.ast.sliced] == .slice_open or node_tags[full.ast.sliced] == .slice_sentinel) and
                 nodeIsTriviallyZero(tree, full.ast.start))
-            {
-                const lhs_extra = tree.sliceOpen(full.ast.sliced).ast;
+            blk: {
+                const lhs_extra = tree.fullSlice(full.ast.sliced).?.ast;
+                if (node_tags[full.ast.sliced] == .slice_sentinel) {
+                    if (tree.sliceSentinel(full.ast.sliced).ast.end == 0) {
+                        @panic("bad");
+                    } else {
+                        break :blk; // a[0..end][0..len] or a[0..end :sentinel][0..len]
+                    }
+                }
 
                 const lhs = try expr(gz, scope, .{ .rl = .ref }, lhs_extra.sliced);
                 const start = try expr(gz, scope, .{ .rl = .{ .coerced_ty = .usize_type } }, lhs_extra.start);

build/stage4-fmt/bin/zig fmt --check --ast-check ~/repos/zig/src ~/repos/zig/lib ~/repos/bun ~/repos/mach ~/repos/zls ~/repos/tigerbeetle/ ~/repos/ghostty/ 2> /dev/null

AFAICT, the .slice_length case is only meant as an code optimization that causes no semantic changes. I don't see the benefit of putting in effort to optimize for non existent code.

dweiller · 2024-12-31T08:52:12Z

I'm not particularly aware of real-world usecases - the place noticed it was in the behaviour tests for slicing. I would also expect that this pattern is rare in real code, the reason being that a sentinel in the first slicing operation can't affect the result/type of the expression and provides no useful information to the compiler (other than a clue that the author may have made an error). I would imagine the way this pattern would emerge in a code base is a pre-existing, open sentinel-terminated slice gets reworked to be done by-length by adding the second slice operation.

AFAICT, the .slice_length case is only meant as an code optimization that causes no semantic changes. I don't see the benefit of putting in effort to optimize for non existent code.

Yes - it shouldn't have semantic effects, the semantic effects should be in moving from s[a.. a + len] to s[a..][0..len]. It just seemed odd to me to disable the optimization in this circumstance, but I haven't looked into how much complication would be needed in Sema to re-enable it. If it is only an optimization for the compiler I would agree that it's probably not worth changing. If it's going to affect generated code I would rather re-enable the optimization (or simply making s[a.. :b][0..len] a compile error) to remove an unnecessary performance edge case. I'm less familiar with the backends though, so I'd have to read through the code to figure out how/if it affects generated code.

dweiller added the bug Observed behavior contradicts documented or intended behavior label Dec 31, 2024

dweiller mentioned this issue Dec 31, 2024

disallow slicing many-item pointer with different sentinel (rev. 2) #22372

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

By-length slicing with sentinel in first slice doesn't do a by-length slice #22371

By-length slicing with sentinel in first slice doesn't do a by-length slice #22371

dweiller commented Dec 31, 2024 •

edited

Loading

Techatrix commented Dec 31, 2024

dweiller commented Dec 31, 2024 •

edited

Loading

By-length slicing with sentinel in first slice doesn't do a by-length slice #22371

By-length slicing with sentinel in first slice doesn't do a by-length slice #22371

Comments

dweiller commented Dec 31, 2024 • edited Loading

Zig Version

Steps to Reproduce and Observed Behavior

Expected Behavior

Techatrix commented Dec 31, 2024

dweiller commented Dec 31, 2024 • edited Loading

dweiller commented Dec 31, 2024 •

edited

Loading

dweiller commented Dec 31, 2024 •

edited

Loading