Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Ix laws #119

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Fix Ix laws #119

wants to merge 1 commit into from

Conversation

Topsii
Copy link

@Topsii Topsii commented Sep 20, 2020

Changed the tests in Tests.QuickCheck.Classes.Ix to match
the laws in the documentation of Data.Ix

  • In particular ixRangeIndex was incorrect:

    *> import Data.Ix
    *> x11 = (1,1) :: (Int,Int)
    *> x33 = (3,3) :: (Int,Int)
    *> x20 = (2,0) :: (Int,Int)
    *> x11 <= x33 && (x20 >= x11 && x20 <= x33)
    True
    (0.01 secs, 960,736 bytes)
    *> index (x11, x33) x20
    *** Exception: Error in array index
    *> inRange (x11, x33) x20
    False
    
  • I'm confused by the dos in the code. Someone more experienced with QuickCheck than me should check if it is really ok to remove them.

  • I also removed the universal precondition that the lower bound is smaller than the upper bound, as it is not a documented law in Data.Ix.

While Ord is a superclass of Ix the documented laws do not depend on Ord. Although the documentation introduces Ix as

mapping a contiguous subrange of values in a type onto integers.

In my mind this means that an instance of Ord orders the inhabitants of the type in a sequence. An instance of Ix drops some elements from that sequence (like x20 in the example above) but retains the order of the remaining elements such that it is a subsequence.
Hence I think the following law captures the relation between Ord and Ix:

inRange (l,u) i && inRange (l,u) j ==> (index (l,u) i <= index (l,u) j) == (i <= j)

Further the documentation in Data.Ix continues

The first argument (l,u) of each of these operations is a pair specifying the lower and upper bounds of a contiguous subrange of values.

From this I derive the law:

inRange (l,u) i ==> l <= i && i <= u

Though I have not added any such implicit law from Data.Ix in this commit.

Changed the tests in Tests.QuickCheck.Classes.Ix to match
the laws in the documentation of Data.Ix

In particular ixRangeIndex was incorrect:

*> import Data.Ix
*> x11 = (1,1) :: (Int,Int)
*> x33 = (3,3) :: (Int,Int)
*> x20 = (2,0) :: (Int,Int)
*> x11 <= x33 && (x20 >= x11 && x20 <= x33)
True
(0.01 secs, 960,736 bytes)
*> index (x11, x33) x20
*** Exception: Error in array index
*> inRange (x11, x33) x20
False
@Topsii
Copy link
Author

Topsii commented Sep 21, 2020

Let me also present an otherwise lawful Ix instance where the law on the bounds is violated:
Ix instance for ordered pairs without duplicates
Consider an Ix instance for an ordered pair of two elements of the same type. Additionally the pair must not contain two equal elements. Unfortunately following the usual derivation scheme for Ix from the Haskell Standard would yield

range (minBound, maxBound) /= [ minBound .. maxBound ]

For example let the element type be a digit (number from 0-9), then

range (Pair 0 1, Pair 9 8)  ==  [ Pair 0 1  /=  [ Pair 0 1  ==  [ Pair 0 1 .. Pair 9 8 ]
                                , Pair 0 2      , Pair 0 2
                                , ...           , ...     
                                , Pair 0 8      , Pair 0 9
                                , Pair 1 2      , Pair 1 0
                                , Pair 1 3      , Pair 1 2
                                , ...           , ...     
                                , Pair 9 7      , Pair 9 7
                                , Pair 9 8      , Pair 9 8
                                ]               ]         

This is because the Standard Ix derivation scheme derives the bounds (0,9) and (1,8) for the first and second element of the pair respectively.

If you are naive like me you might want to 'fix' this by defining

extendBounds :: Ord a => (Pair a, Pair a) -> (a, a)
extendBounds (Pair l1 l2, Pair u1 u2) = (min l1 l2, max u1 u2)

and using the extended bounds (0,9) for both elements of the pair. This yields a lawful Ix instance except for the aforementioned implicit law on the bounds inRange (l,u) i ==> l <= i && i <= u. In particular it can be that the range (u, l) with u < l contains elements. For example range (Pair 1 0, Pair 1 0) evaluates to [ Pair 0 1, Pair 1 0 ]. Even more criminal range (Pair 2 0, Pair 1 0) yields the same nonempty list such that the lower bound is not even in the range. Nonetheless every other law is satisfied. And now comes the dilemma: The supposedly criminal instance arguably actually obeys the documentation in Data.Ix because 'specifying' is a rather vague term.

The first argument (l,u) of each of these operations is a pair specifying the lower and upper bounds of a contiguous subrange of values.

Arguably another class Ix1 (a misleading name, as it is not similar to Eq1 or Ord1) is more appropriate for such a Pair type:

class Ix1 f where
  range :: (a, a) -> [ f a ]
  index :: (a, a) -> f a -> Int
  inRange :: (a, a) -> f a -> Bool
  rangeSize :: (a, a) -> Proxy f -> Int

However this places restrictions on the possible derivation schemes. In what way, number and order may different (field-)types a, b, (f a), (g a) be composed to admit a generic derivation of an (efficient/intuitive/obvious) Ix1/Ix instance for the composed type given certain Ix1/Ix instances for the base (field-)types? At least for Ix we know about the generic deriving scheme for single constructor types from the Haskell Standard where the order of fields is arbitrary and their types can be different. This tempts me to just go with the Ix instance with extended bounds.

You may consider this a contrived instance and I certainly agree this Ix instance with extended bounds is not as intuitive as it should be. Nonetheless it seems useful enough to me to use it in my project. There I would otherwise have to use Ix2 (not Ix1) and would still be lacking a generic deriving mechanism.

If you think of adding the aforementioned implicit laws I hope you consider such unusual Ix instances.
I think explicitely allowing or disallowing such instances is both reasonable (I tend to disallow), but alas the current documentation in Data.Ix does not address this issue.

If you want to allow/support such instances for now, then the precondition l <= u skips relevant tests. Hence I removed it.

Meanwhile I'm thankful that your code helped me uncover that the rangeSize default method of my instance violated the Ix laws! :)

@andrewthad
Copy link
Owner

Thoughts:

  • Thanks for doing this! I'm glad this library was able to help you find something wrong in your code.
  • Concerning the l <= u preconditions, I had always assumed that Ix's Ord superclass implied some kind of relationship between the Ord methods and the Ix method. But the Ix instance for (a,b) convinces me that there is no such relationship and that the superclass is actually spurious. Sad.
  • The do notation was not needed, so removing that is fine.

In test/Spec.hs, there is a allLaws function that's used for integral types. Can you add the Ix tests to that and confirm that the test suite still passes with your changes?

@Topsii
Copy link
Author

Topsii commented Sep 21, 2020

In test/Spec.hs, there is a allLaws function that's used for integral types. Can you add the Ix tests to that and confirm that the test suite still passes with your changes?

The test suite hangs forever (several hours before I stopped) when it tests ixLaws for Int64. Testing Int32 hangs as well. Int16 passes as expected. Similarly Word64 and Word32 hang while Word16 passes. Int and Word both pass. Every test contains a call of range. I could imagine that being relevant.

  • Concerning the l <= u preconditions, I had always assumed that Ix's Ord superclass implied some kind of relationship between the Ord methods and the Ix method. But the Ix instance for (a,b) convinces me that there is no such relationship and that the superclass is actually spurious. Sad.

Hmmm, at first I was surprised you think there is no relationship between Ord and Ix after what I wrote. Then I realized I misread subrange as subsequence. Basically my understanding of Ix was

Ix is used to map a subsequence of values in a type onto a contiguous subrange of integers starting at 0

instead of the actual documentation

Ix is used to map a contiguous subrange of values in a type onto integers.

My wishful interpretation made so much sense to me. The Ix instance for (a,b) is not mapping a contiguous (sub-)range of values unless you specifically think of a grid of values.

So I basically thought we had range (l,u) == sort (range (l,u)). I'd still like to think this is one of the laws. But I'm worried about all this ambiguity in the documentation. What exactly is a subrange? I guess this question is partially answered by the return value of range (l,u). Clearly it is bounded by a lower and an upper bound. So that justifies not calling it a subset/subsequence. But is it ordered in a meaningful way ('contiguous' bounded subsequence) or not ('contiguous' bounded subset)?

And here is another attempt at translating the documentation

The first argument (l,u) of each of these operations is a pair specifying the lower and upper bounds of a contiguous subrange of values.

into laws such that they are satisfiable by the Ix instance for unordered pairs without duplicates from my second message

rangeSize (l,u) > 0  &&  rangeSize (l,m) > 0    ==>    head (range (l,u))  ==  head (range (l,m))
rangeSize (l,u) > 0  &&  rangeSize (m,u) > 0    ==>    last (range (l,u))  ==  last (range (m,u))

The stricter variant (that enforces more intuitive Ix instances?) would be

rangeSize (l,u) > 0    ==>       l == head (range (l,u))
                              && u == last (range (l,u))

Though these laws assume a range that is ordered in a meaningful way.

So let's assume the range is not ordered in a meaningful way.
The trouble is that the notion of bounds obviously demand an Ord instance. Clearly

isLowerBound, isUpperBound :: (a,a) -> a -> Bool
isLowerBound (l,u) i = all (i <=) (range (l,u))
isUpperBound (l,u) i = all (<= i) (range (l,u))

However we do not know whether l itself is lower bound or an argument to a function that computes the lower bound. Further if we do not know whether inRange (l,u) l, then there are actually many lower bounds (what about an empty range?). Are we talking about an infimum or a minimum? If we cannot even be sure that

inRange (l,u) i && inRange (l,u) j    ==>    i == j  ==  index (l,u) i == index (l,u) j

we should rather refer to minimal elements. I give up at trying to come up with a law here. I agree the Ord instance without laws is sad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants