Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink unions and intersections #41

Open
dlfivefifty opened this issue Sep 12, 2018 · 7 comments
Open

Rethink unions and intersections #41

dlfivefifty opened this issue Sep 12, 2018 · 7 comments

Comments

@dlfivefifty
Copy link
Member

dlfivefifty commented Sep 12, 2018

The following inconsistency feels "wrong":

julia> union(1,2)
2-element Array{Int64,1}:
 1
 2

julia> union(1:1 , 2:2)
2-element Array{Int64,1}:
 1
 2

julia> union(1..1, 2..2)
ERROR: ArgumentError: Cannot construct union of disjoint sets.
Stacktrace:
 [1] _leq_union at /Users/solver/Projects/IntervalSets.jl/src/interval.jl:148 [inlined]
 [2] union(::Interval{:closed,:closed,Int64}, ::Interval{:closed,:closed,Int64}) at /Users/solver/Projects/IntervalSets.jl/src/interval.jl:125
 [3] top-level scope at none:0

Following the success of the lazy Broadcasted approach, I'd propose that

union(a::AbstractInterval, b::AbstractInterval) = UnionDomain(a,b)

where UnionDomain (and IntersectDomain) are lazy:

struct UnionDomain{T, D} <: Domain{T}
   domains::D
end
UnionDomain(a...) = UnionDomain{mapreduce(eltype,promote_type,a),typeof(a)}(a)
in(x, d::UnionDomain) = any(in.(x, d.domains))

as in https://github.com/JuliaApproximation/Domains.jl. This works as follows:

julia> UnionDomain(1..1, 2..2)
a union of 2 domains:
	1.	: 1..1
	2.	: 2..2

julia> 1  UnionDomain(1..1, 2..2)
true

julia> 2  UnionDomain(1..1, 2..2)
true

julia> 1.5  UnionDomain(1..1, 2..2)
false

The current behaviour would be recovered via

julia> Interval(union(1..2, 1.5..3))
1.0..3.0

julia> Interval(union(1..2, 1.5..3))
1.0..3.0

julia> Interval(union(1..1, 2..2))
ERROR: ArgumentError: Cannot construct interval from disjoint union of intervals.
@dlfivefifty
Copy link
Member Author

@ararslan @timholy Please let me know if you think this is a good idea and I can prepare a PR.

@timholy
Copy link
Member

timholy commented Sep 20, 2018

It's an interesting proposal. I guess the main alternative is like sqrt(-3) returning a DomainError, and asking users who want that operation to succeed to call it with arguments under which union is a closed operation:

union(IntervalUnion.((1..2, 3.5..5))...)

One potential advantage of this approach is that I'm not sure you need a separate IntersectDomain---the mathematical concept here is a union of intervals, an "object," rather than an "operation."

@scheinerman
Copy link

Hi Folks. The solution I chose for ClosedIntervals is not to define union and intersect but rather have the operations join and meet. In the context of lattices/partially ordered sets, the join of two object is the smallest object that is above both. For intervals, the smallest interval containing both [1,2] and [3,4] is [1,4]. On the other hand, the meet of two intervals is the largest interval contained in both, and that agrees with set intersection. My two cents.

@dlfivefifty
Copy link
Member Author

dlfivefifty commented Sep 20, 2018

I think the best behaviour is to mimic how union works on collections as much as possible.

For collections, it auto-promotes: union(1,2) works and returns a [1,2], and we do not need to convert 1 and 2 to AbstractVector{Int}s first. Also, union(1..2, [3, 7, 10]) would be supported. In the proposal it would return UnionDomain((1..2, [3,7,10])).

A more general design might be able to include all of the following operations:

  1. union
  2. setdiff
  3. intersection (though I agree this not needed for intervals and the current behaviour is good.)
struct CombinationDomain{T, FF<:Function, DD<:Tuple} <: Domain{T}
   op::FF
   domains::DD
end

in(x, d::CombinationDomain) = d.op(in.(x, d.domains))

const UnionDomain{T,DD} = CombinationDomain{T,typeof(any),DD}
const IntersectDomain{T,DD} = CombinationDomain{T,typeof(all),DD}

setdiff_f(a,b) = a && !b
const SetDiffDomain{T,DD} where DD<:Tuple{A,B} where {A,B} = CombinationDomain{T,typeof(setdiff_f),DD}

@timholy
Copy link
Member

timholy commented Sep 20, 2018

I guess it's a mix. The auto-promotion of numbers to arrays is necessitated because there isn't a way to express the union of two distinct numbers as a number. However, for a discrete set (aka, a collection), the output type tends to be preserved: we don't, for example, return a Set when passed Vectors even though internally we perform that exact conversion temporarily. Here we have a continuous set, so in some sense one can argue we should also try to preserve the output type. In this case, for Intervals there is a useful notion of union that allows one to return another Interval, but whether it succeeds depends on values rather than types.

I can go either way here; I guess the advantage of your approach is that union will always work, whereas mine is perhaps more motivated by backwards compatibility and keeping types contained by default. I do not feel that the points I've made obviously have more merit than yours; indeed, having raised them as potential concerns, I should say I think I slightly lean in the direction you're proposing. EDIT: probably my biggest concern is limiting "type explosion"; I'm not sure I'm happy with UnionDomain, IntersectDomain, CombinationDomain, and SetDiffDomain; if you can figure out how to do everything with IntervalUnion I will be much happier.

Either way, I will happily support you if you feel this is the right direction (and assuming, as I do, that Interval(union(1..2, 1.5..3)) will have the same performance it currently has thanks to all those lovely compiler optimizations).

@scheinerman, I also would not be opposed to join and meet. At one point I thought I had defined boundingbox, but I now suspect I did that in one of my lab's internal packages, and indeed join is a briefer name for the same operation.

@dlfivefifty
Copy link
Member Author

Looking at Base's union code it should actually be possible to jack into the following line:

union(s, sets...) = union!(emptymutable(s, promote_eltype(s, sets...)), s, sets...)

I'm thinking something like this:

eltype(x::AbstractInterval{T}) = Infinitesimal{T}
struct DomainStyle end
emptymutable(s, ::Infinitesimal) = DomainStyle()
union!(::DomainStyle, s...) = UnionDomain(s)

=============

This has the obvious drawback that it is abusing union! to mean something completely different. Perhaps a PR into Base would be more appropriate that replicates Broadcasted, something like this:

abstract type UnionStyle end
struct SetStyle <: UnionStyle end
struct VectorStyle <: UnionStyle end

UnionStyle(::Type{<:AbstractSet}) = SetStyle()
UnionStyle(_) = VectorStyle()

struct Unioned{Style,SV}
   args::SV
end

Unioned(args) = Unioned{combine_union_styles(args), typeof(args)}(args)

union(s...) = materialize(Unioned(s))
union!(x, s...) = materialize!(x, Unioned(s))

@timholy
Copy link
Member

timholy commented Sep 22, 2018

eltype(x::AbstractInterval{T}) = Infinitesimal{T}

As I'm sure you see, that's problematic because eltype(x::AbstractInterval{T}) = T (and should stay that way).

The BroadcastStyle-like solution seems pretty reasonable to me. Is union! easily supportable in general, though? When you're doing this inferrably with two sets I think the result is going to have to be like your UnionDomain above, and that will be immutable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants