Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AA Tree and tests #203

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

Add AA Tree and tests #203

wants to merge 11 commits into from

Conversation

hummy123
Copy link

I wasn't able to run the test in this repository as I don't have .NET 6.0.401 SDK installed (only a version matching .NET 7), but I don't expect anything to break since it was just mostly a copy-paste from my other repository.

Appreciate the time taken in reviewing this.

@gdziadkiewicz gdziadkiewicz self-assigned this Dec 13, 2022
@gdziadkiewicz
Copy link
Collaborator

Checking.

/// A balanced binary tree similar to a red-black tree which may have less predictable performance.
type AaTree<'T when 'T: comparison> =
| E
| T of int * AaTree<'T> * 'T * AaTree<'T>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using names for the T case data as in the Shape example here - https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/discriminated-unions .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Would be helpful when editing the tree itself to specify the different data types mean. (Added.)

testList "AaTree" [

(* Existence tests. *)
test "test isEmpty" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we divide the tests to have test per assertion/aspect and explain in the test name which aspect of isEmpty we are testing? Here I would go with something along the lines of "test isEmpty returns true for an empty aatree" and "test isEmpty returns false for a single element aatree". This is additional work for you to do now, but it will make future maintenance easier.


(* Existence tests. *)
test "test isEmpty" {
Expect.isTrue <| AaTree.isEmpty AaTree.empty <| ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not use empty assertion messages


test "test tryFind" {
let tree = AaTree.ofList ["hello"; "bye"]
Expect.equal (Some("hello")) <| AaTree.tryFind "hello" tree <| ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using AAA(arrange, act, assert) structure for the test.
Maybe:

let tree = AaTree.ofList ["hello"; "bye"]
let result = AaTree.tryFind "hello" tree
Expect.equal (Some "hello" ) result  "tryFind should find hello in aatree created from [hello; bye]"

let array = [|1; 2; 3; 4; 5|]
let tree = AaTree.ofArray array
for i in array do
Expect.isTrue <| AaTree.exists i tree <| ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expect.containsAll?

let inputList = [0;1;2;3]
let tree = AaTree.ofList inputList
let outputList = AaTree.toList tree
Expect.equal outputList inputList ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that AaTree.toList output will be sorted?

open FSharpx.Collections
open System.Collections.Generic

(* Implementation guided by following paper: https://arxiv.org/pdf/1412.4882.pdf *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be able to familiarize myself with it on Tuesday.

@hummy123
Copy link
Author

I think all of these are good suggestions and appreciate your rigorous standards. I'll have them implemented in the next few days hopefully.


interface IEnumerable<'T> with
member x.GetEnumerator() =
(x.ToList() :> _ seq).GetEnumerator()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AATree.toSeq?

let toList (tree: AaTree<'T>) =
foldBack (fun a e -> e::a) [] tree

let toSeq (tree: AaTree<'T>) =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we provide a lazy implementation? IMO it is unexpected and surprising for the user that just getting the seq, without even using it is O(n). Maybe something along the lines of:

    let rec toSeq (tree: AaTree<'T>) =
        seq{
            match tree with
            | E -> ()
            | T(_, l, v, r) ->
                yield! toSeq l
                yield v
                yield! toSeq r
        }

?

open FSharpx.Collections
open FSharpx.Collections.Experimental
open Expecto
open Expecto.Flip
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that you are not using the flip style and that the code will compile again if you remove it.

@hummy123
Copy link
Author

Hi @gdziadkiewicz and thanks again for the feedback. (I personally think it's easier to reply "here" rather than you jumping up/down the page navigating to different comments, but let me know if you prefer the other way of individual threads.)

  1. I divided the tests to have one assertion each test and gave more descriptive test names as asked (good idea).
  2. I used non-empty assertion messages in my new commit, but to be honest I think the messages I wrote are a bit tautologous with the test's name. (I didn't find much clear guidance on what to write from the other tests here like IntMap.fs but happy to revise them if asked.)
  3. I tried to rewrite some/most of the compact tests to use AAA as recommended (good idea again).
  4. You suggested using Expect.containsAll for some of the "Conversion from tests" (AaTree.ofList/Array/Seq), which I understand over the for loop that iterates over the input list and asserts AaTree.exists for each of them.
  • I think your way is definitely cleaner, but if we wanted to use Expect.containsAll we would need to use AaTree.toList/Array/Seq to build a list from the tree first.
  • I'm not sure if the mutual recursion with conversion methods (testing ofList with toList) is something I should worry about, but happy to make use of it in the test if asked (we have conversion toList/Array/Seq methods that build a list with of and check the output with to, so I think in that case we would be collapsing the conversion to/from tests since their logic would be exactly the same.
  1. It's guaranteed that AaTree's conversion to methods will be sorted, as it does an in-order traversal of the AA Tree.
  2. Good point. Replaced with toSeq as suggested, but the :> cast is still there (get a type inference error which I'm not sure how to solve in an OOP-style definition).
  3. I haven't really worked with Sequences in F# (kind of new to the language/FP in general), so I don't feel confident writing a lazy toSeq implementation but I think it's a good idea. Happy to replace the current impplementation with your code (although I can't write test for a lazy function because of my inexperience with them).
  4. Removed the open for Expecto.Flip as suggested and have the tests running successfully.
    image

@hummy123
Copy link
Author

hummy123 commented Jan 2, 2023

I reviewed my code and compared it with the paper (and also with the Isabelle HOL proof linked below which corrects the paper in a couple of places) and made some changes.

Summary of my mistakes (now fixed):

  • My code only called the adjust function when deleting the left child when it should call the function in all cases (less than, greater than, equal).

Summary of changes from the proof (corrects errors in the paper, and corrections now implemented in code):

  • The function dellrg which the proof calls split_max recursed the left subtree in the original but is meant to recurse the right sutree.
  • Fixed calculation for the nlvl helper function.

Isabelle HOL proof for reference: https://isabelle.in.tum.de/library/HOL/HOL-Data_Structures/AA_Set.html .

@gdziadkiewicz
Copy link
Collaborator

@hummy123 the CI build is failing due to problems with code formatting. Could you resolve that?

@@ -75,6 +72,15 @@ module AaTree =
then split <| (skew <| T(h, l, v, insert item r))
else node

(* nlvl function fixed according to Isabelle HOL prrof below: *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prrof -> proof

@hummy123
Copy link
Author

hummy123 commented Jan 4, 2023

@gdziadkiewicz Got it working thanks to your help (haven't used FAKE/GitHub actions much before).

Here's the action running successfully on my own fork: https://github.com/hummy123/FSharpx.Collections/actions/runs/3841007372/jobs/6540731078 .

(Also updated FAKE as it was giving me errors.)

@gdziadkiewicz
Copy link
Collaborator

@hummy123 I'm sorry that I'm not able to work on it quicker. Would you mind if I add some property tests to your code and change the toSeq implementation to the one I proposed?

@hummy123
Copy link
Author

@gdziadkiewicz No worries - appreciate your time. Anything that makes the code better and tests more resilient sounds good to me, so go ahead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants