Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add std.error_set helpers (containsAll,Excluding,Intersect) #19092

Closed
wants to merge 3 commits into from

Conversation

rohlem
Copy link
Contributor

@rohlem rohlem commented Feb 26, 2024

As per recommendation by @squeek502 in #19008 (comment) ,
this PR adds utiliy helper functions in a new module std.error_set to the standard library, which closes #19008 .
The naming of std.error_set IMO makes std.error_set.contains(Set, err) sound more natural than error_sets.contains/contained,
though precedent in std.enums would have lead to std.error_sets.

I've written it the way I privately write code, with explicit asserts for invalid inputs and same-line comments,
so it's clear which incorrect usage scenarios were considered.
From what I remember other places in std have kept assertions at a minimum,
so feel free to remove / tell me to remove any you deem unnecessary.

Also note that the current implementations of Excluding and Intersect are based on comptime-executed inline for.
With accepted proposal #2473 implemented, we could instead more directly communicate them to the compiler:

pub fn Excluding(comptime BaseErrorSet: type, comptime ToExcludeErrorSet: type) type {
  return @TypeOf(switch(@as(BaseErrorSet, undefined)) {
    ToExcludeErrorSet => unreachable, //blocked by #2473
    else => |e| e;
  });
}
pub fn Intersect(comptime ErrorSetA: type, comptime ErrorSetB: type) type {
  return @TypeOf(switch(@as(ErrorSetA, undefined)) {
    ErrorSetB => |e| e, //blocked by #2473
    else => unreachable;
  });
}

Additionally, the status-quo type system doesn't have a way to express a restricted non-exhaustive error set,
so Excluding(anyerror, error{A}) can't express that error.A is excluded and has to return anyerror again.
This alternative implementation would however not be coupled with std.builtin.Type structure and be naturally compatible with such upgrades.

lib/std/error_set.zig Outdated Show resolved Hide resolved
@squeek502 squeek502 force-pushed the PR-std-error_sets branch 2 times, most recently from 14b47f9 to ce74e68 Compare February 29, 2024 06:59
Copy link
Member

@Vexu Vexu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match std.enums maybe rename it to std.error_sets?

@squeek502
Copy link
Collaborator

Resolved the conflicts but left the name alone for now. From the OP:

The naming of std.error_set IMO makes std.error_set.contains(Set, err) sound more natural than error_sets.contains/contained, though precedent in std.enums would have lead to std.error_sets.

Unsure which is better.

@squeek502
Copy link
Collaborator

squeek502 commented Mar 27, 2024

It seems that #19414 (most likely, haven't confirmed that's the specific change that matters) caused the code in this PR to start erroring with things like:

lib\std\error_set.zig:59:12: error: comptime dereference requires '[3]builtin.Type.Error' to have a well-defined layout, but it does not.
    return @Type(.{ .ErrorSet = remaining_error_buffer[0..remaining_error_count] });
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I've added a commit that fixes those errors (c05ad17) but I'm not sure it's fully correct, as it seems to be using a reference to a comptime var (that is, since the function returns a type, I believe the scope is implicitly comptime, right?) which seems to go against the advice in this explanation. Would appreciate if you could give the changes in c05ad17 a once-over, @mlugg.

@rohlem
Copy link
Contributor Author

rohlem commented Mar 27, 2024

@squeek502 From my understanding the recommended solution would be to copy the array value to a local const,
by dereferencing the *const [N]T yielded by the comptime-known slicing expression:

const final_errors = remaining_error_buffer[0..remaining_error_count].*;

Now &final_errors would be a pointer to comptime-known const, which should be allowed by the new semantics.
However, I haven't had the chance to test this (the newest ziglang.org download still doesn't include the relevant commits).

I'm also not sure why the newly added commit, which moves the values from one comptime var array to another,
doesn't trigger the same compile error as before.
Maybe the implicit conversion from *const [N]T to []const T somehow prevents the check - that seems worth looking into.

@squeek502
Copy link
Collaborator

const final_errors = remaining_error_buffer[0..remaining_error_count].*;

triggers the comptime dereference requires '[3]builtin.Type.Error' to have a well-defined layout, but it does not. error.

@mlugg
Copy link
Member

mlugg commented Mar 27, 2024

@squeek502 hi! Yes, sorry, you're hitting a latent bug exposed by my recent changes. I'm actively working on rewriting comptime pointer access which should fix the issue. In the meantime, if you have a workaround that works, stick with it and throw a TODO comment there -- I'll open a tracking issue later today.

@Pyrolistical
Copy link
Contributor

Existing std methods call it differenceWith and intersectWith

@rohlem
Copy link
Contributor Author

rohlem commented Apr 12, 2024

@Pyrolistical Thank you for that data point, IMO it makes sense to unify naming.
For me with a less-mathematical background "(set a) excluding (set b)" sounds more natural and obvious (i.e. is what I would look/search for in code) than "the difference of (set a) with (set b)".
That said, I'll happily defer to others' / maintainers' opinions on this.
(Note that we also have the option of providing both names, with one const a = b; aliasing the other.)

@ifreund
Copy link
Member

ifreund commented Apr 12, 2024

+1 for difference from me, It is the only standard name for this operation I know of, at least in english and german ("Differenz").

Edit: Nevermind, I forgot that "complement" is also used in math, as in "the complement of A in B" which is equal to B \ A (the difference of B and A). I don't think "complement" is a good fit here though.

rohlem and others added 3 commits July 28, 2024 19:40
Instead of redefining the error set from scratch just to exclude a few errors, we can define the error set in terms of existing explicit error sets.
@squeek502 squeek502 force-pushed the PR-std-error_sets branch from 19ebd59 to e2fc9c1 Compare July 29, 2024 02:44
@squeek502
Copy link
Collaborator

Rebased + removed the commit that was working around #19452

@mochalins
Copy link
Contributor

+1 on the difference/intersection naming! In addition to precedence in standard library, "difference" feels more precise; I immediately know what a set difference is, it denotes properties of the operation (non-commutativity), and it is also more in-line with operation names such as e.g. union.
While I don't think "Excluding" as a name is that big of a deal, it would at least make me pause and check docs/implementation to make sure that it isn't different operation from set difference (e.g. maybe it is an operation that only returns mutually excluded set elements, e.g. symmetric difference).
I think it is easier to go for the more precise term rather than the subjective standard of natural/obvious (others might say e.g. "Subtracting" is more natural/obvious than "Excluding", and as pointed out above different language speakers might find different terminology more familiar).

@rohlem
Copy link
Contributor Author

rohlem commented Jul 29, 2024

@squeek502 Thanks for doing the update!

It's been a while since writing this, and it now looks a bit unnecessary to me to have Without separate from Excluding just for supporting an error value argument.
We could merge them into a single function that takes anytype and uses either @TypeOf(arg) == type or @typeInfo(@TypeOf(arg)) to differentiate.
Or we could just remove Without, and users have to pass error{A} instead of error.A.
Just a suggestion though for whoever wants to decide these things. I'm happy with any way this gets resolved.

@squeek502
Copy link
Collaborator

I'm happy with any way this gets resolved

Same; I have no strong feelings one way or the other about Without

@andrewrk
Copy link
Member

I don't think it's needed enough to add to the standard library.

@andrewrk andrewrk closed this Sep 17, 2024
@squeek502
Copy link
Collaborator

Do you mean the motivating use-case in e2fc9c1 isn't compelling, or that such a use-case should be handled inline instead of introducing a helper namespace?

@andrewrk
Copy link
Member

I think the use case isn't compelling because listing the error set explicitly is easy to maintain and encourages auditing and trimming down of error codes. When an error set becomes too big, it's better to introduce categories and diagnostics structs rather than amassing an enormous error set and then trying to use set theory to manage it.

@squeek502
Copy link
Collaborator

squeek502 commented Sep 17, 2024

I'm not sure how that would help in this situation. The problem with SelfExePathError as it is currently is that if an error gets removed from ReadLinkError/RealPathError, there will be no indication that SelfExePathError should be updated, so it may just indefinitely contain an error that can never be returned. With e2fc9c1, relevant changes to the error sets in question will either be propagated automatically, or they will cause a compile error, which seems ideal.

Unless I'm thinking about it wrong, moving to a diagnostic pattern would just shift the problem from an error set to an enum or a more nebulous type, so I'm not sure it's a solution. In other words, as long as selfExePath is implemented in terms of other functions, then the possible ways that selfExePath can fail are going to necessarily be related to those functions. Making it nicer to model that relationship using error sets seems like a good idea to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A way to narrow error set types (aka the opposite of ||)
8 participants