Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queries do not scale well when table grows in size #77

Open
AlexanderArvidsson opened this issue Sep 10, 2024 · 14 comments
Open

Queries do not scale well when table grows in size #77

AlexanderArvidsson opened this issue Sep 10, 2024 · 14 comments

Comments

@AlexanderArvidsson
Copy link

AlexanderArvidsson commented Sep 10, 2024

First of all, great library, I love the simplicity! But I think there's a big scaling problem here:

From reading how the query works, and analyzing it, it seems that (correct me if I'm wrong), it goes through every root in the entire table, creates the trees, then selects descendants or ancestors from that.
For example, descendants is selected by finding any row that includes the parent in the __tree.tree_path.

In our case, we tested performance with a table that was 100k+ rows with deeply nested rows.
We had one tree with about 150k nested rows, with the rest of the table being very small trees at just a couple of hundred. Because this one tree existed, it affected performance of even empty trees.

WITH RECURSIVE __rank_table(

        "id",
        "parent_id",
        "rank_order"
    ) AS (
        SELECT "storage_directory"."id", "storage_directory"."parent_id", ROW_NUMBER() OVER (ORDER BY "storage_directory"."id") AS "rank_order" FROM "storage_directory"
    ),
    __tree (

        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT

            0,
            array[T.id],
            array[T.rank_order],
            T."id"
        FROM __rank_table T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT

            __tree.tree_depth + 1,
            __tree.tree_path || T.id,
            __tree.tree_ordering || T.rank_order,
            T."id"
        FROM __rank_table T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "storage_directory"."id", "storage_directory"."parent_id" FROM "storage_directory" , "__tree" WHERE ((104113 = ANY(__tree.tree_path)) AND NOT ("storage_directory"."id" = 104113) AND (__tree.tree_pk = storage_directory.id)) ORDER BY ("__tree".tree_ordering) ASC

This query took several seconds to complete, even when selecting a parent that had literally no children at all. That's because, regardless of what parent is selected, the query will still calculate every single tree in the entire table.

By making a simple change to the base select condition:

            SELECT

            0,
            array[T.id],
            array[T.rank_order],
            T."id"
        FROM __rank_table T
        WHERE T."id" = 104113

and removing the ANY check:

        WHERE ((NOT ("storage_directory"."id" = 104113) AND (__tree.tree_pk = storage_directory.id)) ORDER BY ("__tree".tree_ordering) ASC

This query was reduced to a mere 50ms.

I know the above change is obviously not scalable and might very well just apply to this query, and it obviously doesn't work for ancestors method, but I am still questioning the decision to calculate every single tree in the entire table and then extracting the tree.

I'm wondering, is there a way to change this without drastic changes, so that it scales well with huge tables? For descendants, it just need to traverse down from the selected parent, and for ancestors it just needs to traverse up.
If tracking depth, path, etc is important, then maybe descendants could first traverse upwards until it finds the root node, then traverse down from there and select the subtree?

Thanks!

@matthiask
Copy link
Member

Thanks! I haven't yet had the problem where trees became so big that the performance suffered too much.

I have a list of interesting links stashed away, e.g.
https://schinckel.net/2016/01/19/postgres-tree-shootout-part-3%3A-adjacency-list-using-views/
https://schinckel.net/2016/01/27/django-trees-via-closure-view/

I suspect that converting the __tree table to use a (materialized?) view could be a good option for this library but I haven't yet looked into it.

This query was reduced to a mere 50ms.

The queries aren't equivalent at all though? =any() ensures that the given ID isn't ever a part of the tree_path which filters out all descendants while the version without the =any() only filters out the exact node? I'm a bit surprised at the speed up, since the tree still has to be built?

This query took several seconds to complete, even when selecting a parent that had literally no children at all. That's because, regardless of what parent is selected, the query will still calculate every single tree in the entire table.

The docs are lacking in this area, but .tree_filter() could be useful to filter nodes before constructing the tree, see #66

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 11, 2024

Thanks! I haven't yet had the problem where trees became so big that the performance suffered too much.

I have a list of interesting links stashed away, e.g. https://schinckel.net/2016/01/19/postgres-tree-shootout-part-3%3A-adjacency-list-using-views/ https://schinckel.net/2016/01/27/django-trees-via-closure-view/

I suspect that converting the __tree table to use a (materialized?) view could be a good option for this library but I haven't yet looked into it.

This query was reduced to a mere 50ms.

The queries aren't equivalent at all though? =any() ensures that the given ID isn't ever a part of the tree_path which filters out all descendants while the version without the =any() only filters out the exact node? I'm a bit surprised at the speed up, since the tree still has to be built?

This query took several seconds to complete, even when selecting a parent that had literally no children at all. That's because, regardless of what parent is selected, the query will still calculate every single tree in the entire table.

The docs are lacking in this area, but .tree_filter() could be useful to filter nodes before constructing the tree, see #66

Interesting links, thanks! Coincidentally, I think the first link has a query that is similar to one that our developers wrote (if you exclude the path and depth info):

with recursive tree (id) as (
  -- Base case
  SELECT id
  from storage_directory
  where id = 104113

  UNION ALL
  -- Rescurse
  select sd.id
  from storage_directory sd
  join tree on sd.parent_id = tree.id
) select count(*) from tree;

We're really only interested in getting the descendants / ancestors, and for us it would be completely fine to write these queries on our own as 2 different queries, rather than try and make one query that works for both cases. So maybe that's what we're going to do, maybe with something like django-cte, as a last resort unless we can fix these performance problems.

I suspected the queries weren't equivalent, but we did get back the same rows in both of our tests where we selected a node with many descendants (a non-parent node) in both queries.
My assumption was that the second query starts its recursive cte at the selected node, then gets all descendants, while the original query selects all root nodes, then gets the subtree via the =any() call. Since the second query starts at the selected node, it doesn't need a =any() node, and it also doesn't go through trees in the whole database.

Big disclaimer tho; I haven't used recursive CTE's before, so this is all based on some rudimentary testing.

But my concern still remains; the original query constructs every single possible tree in the table on every query, instead of just the one you're targeting.
Unless I'm misunderstanding tree_filter, wouldn't I still need to know all descendants in order to filter for the specific tree I'm targeting? I guess I could apply a broader filter, since we have a project_id reference on the nodes which should limit the searching to within the project instead of the whole table.

@matthiask
Copy link
Member

Ah, you're right. descendants() could indeed be optimized a lot. We'd need a way to modify this line in the PostgreSQL case:

WHERE T."{parent}" IS NULL

The tree_filter indeed does not help in this case. That's sad. PostgreSQL still treats CTEs as optimization fences (at least I think it does), so it would still build the rank table for the all rows, but at least it wouldn't build the full tree, only the required subtree.

@matthiask
Copy link
Member

Thinking out loud, I wonder if we could make an easy, non-intrusive fix just for this case, or if we should better pursue the materialized view optimization for very large trees or for read-heavy workloads. I lean towards the latter, but the perfect is also the enemy of the good. I hope we can avoid adding many small band-aid fixes when we could fix the performance issue once for all.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 11, 2024

Our monitoring is observing that ancestors() has the same problem, BTW. They both use the same query.

I forked your repo and was about to do the exact change you said, with the WHERE condition. But I got stuck since I am not familiar with custom SQLCompilers.

I liked the fact that this library doesn't require building up any pre-existing structure that had to be maintained, like your previous django-mptt did (but in-table columns) or materialized views. Of course, materialized views would probably be a lot faster, but if our rudimentary testing proves anything, it's that recursive CTE's on their own could handle hundreds of thousands of rows, as long as your base condition is sane.

@matthiask
Copy link
Member

Thanks for profiling this! Then it sounds like small fixes here and there would work well.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 11, 2024

We wrote these two queries, which from first glance works. We'd need to integrate it into the platform and have our tests run before confirming it though.
We're running Postgres 15.6, and has not tested this with MySQL / MariaDB / etc.

Descendants:

with recursive tree (id, depth, path, sd) as (
  -- Base case
  select
    id,
    0 as depth,
    array[id] as path,
    *
  from storage_directory
  where id = 103771

  UNION ALL
  -- Recurse
  select 
    sd.id,
    tree.depth+1,
    tree.path || sd.id,
    sd.*
  from storage_directory sd
  join tree on sd.parent_id = tree.id
)
select tree.depth, tree.id, tree.path, tree.name
from tree
order by (tree.depth, tree.id);

Explain:

Sort  (cost=960.06..960.49 rows=171 width=588)
  Sort Key: (ROW(tree.depth, tree.id))
  CTE tree
    ->  Recursive Union  (cost=0.28..950.30 rows=171 width=126)
          ->  Index Scan using storage_directory_pkey on storage_directory  (cost=0.28..8.30 rows=1 width=126)
                Index Cond: (id = 103771)
          ->  Nested Loop  (cost=0.28..93.86 rows=17 width=126)
                ->  WorkTable Scan on tree tree_1  (cost=0.00..0.20 rows=10 width=40)
                ->  Index Scan using storage_directory_parent_id_e8eef042 on storage_directory sd  (cost=0.28..9.34 rows=2 width=86)
                      Index Cond: (parent_id = tree_1.id)
  ->  CTE Scan on tree  (cost=0.00..3.42 rows=171 width=588)

Ancestors:

 with recursive tree (id, steps, sd) as (
  -- Base case
  select
  	id,
  	0 as steps,
  	*
  from storage_directory
  where id = 137811

  UNION ALL
  -- Recurse
  select
    sd.id,
    tree.steps+1,
    sd.*
  from storage_directory sd
  join tree on sd.id = tree.parent_id
)
select tree.id, tree.parent_id, tree.steps, tree.name
from tree
order by steps DESC;

Explain:

Sort  (cost=847.75..848.00 rows=101 width=528)
  Sort Key: tree.steps DESC
  CTE tree
    ->  Recursive Union  (cost=0.28..842.37 rows=101 width=94)
          ->  Index Scan using storage_directory_pkey on storage_directory  (cost=0.28..8.30 rows=1 width=94)
                Index Cond: (id = 137811)
          ->  Nested Loop  (cost=0.28..83.20 rows=10 width=94)
                ->  WorkTable Scan on tree tree_1  (cost=0.00..0.20 rows=10 width=8)
                ->  Index Scan using storage_directory_pkey on storage_directory sd  (cost=0.28..8.30 rows=1 width=86)
                      Index Cond: (id = tree_1.parent_id)
  ->  CTE Scan on tree  (cost=0.00..2.02 rows=101 width=528)

Both of these are performing really well according to the explain queries. We will try running these in a database with 100k+ rows.
The * is to avoid doing a JOIN on storage_directory, because that introduces a SEQ SCAN which we want to absolutely avoid. By including all columns in the recurse, we skip joining. In Django, maybe this can be dynamically changed based on values, but I don't really mind it atm.

One thing to note is that this drops the fact that descendants tree_path includes its ancestors as well, and makes the assumption that if you want to get the full path for each node, you need to manually join ancestors and the descendant path yourself.

The ordering is very much simplified by using order by (depth, id) over the rank_table, was there any reason you had to do it that way?

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 11, 2024

Compare the above with EXPLAIN on the original descendants query:

Sort  (cost=3620.13..3620.46 rows=130 width=76)
  Sort Key: __tree.tree_ordering
  CTE __rank_table
    ->  WindowAgg  (cost=496.70..553.49 rows=3245 width=16)
          ->  Sort  (cost=496.70..504.81 rows=3245 width=8)
                Sort Key: storage_directory_1.id
                ->  Seq Scan on storage_directory storage_directory_1  (cost=0.00..307.45 rows=3245 width=8)
  CTE __tree
    ->  Recursive Union  (cost=0.00..1861.41 rows=25976 width=72)
          ->  CTE Scan on __rank_table t  (cost=0.00..64.90 rows=16 width=72)
                Filter: (parent_id IS NULL)
          ->  Hash Join  (cost=5.20..127.70 rows=2596 width=72)
                Hash Cond: (t_1.parent_id = __tree_1.tree_pk)
                ->  CTE Scan on __rank_table t_1  (cost=0.00..64.90 rows=3245 width=16)
                ->  Hash  (cost=3.20..3.20 rows=160 width=72)
                      ->  WorkTable Scan on __tree __tree_1  (cost=0.00..3.20 rows=160 width=72)
  ->  Hash Join  (cost=356.11..1200.67 rows=130 width=76)
        Hash Cond: (__tree.tree_pk = storage_directory.id)
        ->  CTE Scan on __tree  (cost=0.00..844.22 rows=130 width=72)
              Filter: (104113 = ANY (tree_path))
        ->  Hash  (cost=315.56..315.56 rows=3244 width=8)
              ->  Seq Scan on storage_directory  (cost=0.00..315.56 rows=3244 width=8)
                    Filter: (id <> 104113)

This does 2 Seq scans, which has to go through every single row in the table twice, most likely also has to do disk reads because these might not be in memory. Then it has to order the rank table, which contains every single row in the table. The CTE scan is probably in-memory, since it does a Seq scan on the materialized result of the CTE, but it still contains every single row in the table.
Not to mention the WindowAgg on the Seq scan in the rank table.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 11, 2024

Sorry for the spam (let me know if all this info is overwhelming you!); but I wanted to let you know that we managed to implement the descendants query via django-cte:

    def descendants(self, include_self=False, **kwargs):
        """
        Returns all descendants of the current node
        """
        if not self.pk:
            raise ValueError("Cannot get descendants of an unsaved node")

        def make_cte(cte):
            # non-recursive: get root nodes
            return (
                Directory.objects.filter(id=self.pk)
                .annotate(
                    path=RawSQL("array[id]", ()),
                    depth=Value(0, output_field=IntegerField()),
                )
                .union(
                    # recursive union: get descendants
                    cte.join(Directory, parent_id=cte.col.id).annotate(
                        path=Func(
                            cte.col.path,
                            F("id"),
                            function="array_append",
                            output_field=ArrayField(IntegerField()),
                        ),
                        depth=cte.col.depth + Value(1, output_field=IntegerField()),
                    ),
                    all=True,
                )
            )

        cte = With.recursive(make_cte)

        qs = cte.queryset().with_cte(cte).order_by(cte.col.depth, cte.col.id)

        if not include_self:
            qs = qs.exclude(id=self.pk)

        return qs

I'm wondering; could this library be simplified if it leveraged django-cte? Alternatively, if this library focused on Materialized View representation of the trees, it would offer an potentially much faster alternative to the recursive CTE.

@matthiask
Copy link
Member

matthiask commented Sep 11, 2024

I'm wondering; could this library be simplified if it leveraged django-cte? Alternatively, if this library focused on Materialized View representation of the trees, it would offer an potentially much faster alternative to the recursive CTE.

When I started out with this, django-cte wasn't advanced enough and I needed/wanted something simple quickly. I can very well imagine that django-cte could be revisited, either as an alternative or as a base. It still doesn't feel too good since it's maintained somewhat actively, but not too actively -- the test suite only tests Django 4.2, nothing newer.

Alternatively, if this library focused on Materialized View representation of the trees, it would offer an potentially much faster alternative to the recursive CTE.

Yes, this could be a good plan.

It really depends on the complexity tax we have to pay in django-tree-queries. As you say, use cases up to a few hundred or thousand nodes do not need the optimization. Larger trees do need it, but I'm a bit selfish here and do not want to pay too much of this tax myself since the solution doesn't matter too much to my use cases. The tree building definitely appears in traces though and it would probably make sense anyway.

Sorry for the spam (let me know if all this info is overwhelming you!);

Not at all, it's interesting! I appreciate you spending the time explaining all of this and bringing new ideas to the table.

@ihasdapie
Copy link

For what it's worth, I have a table with about 400k entries and found that the performance of ancestors() and descendants() has dropped significantly. I really liked the simplicity of this library, but the performance issues are starting to show.

@AlexanderArvidsson do you have a working snippet for ancestors()? How did you integrate this, and were there any caveats in your testing?

@matthiask
Copy link
Member

I'm sorry to hear that. Can you pinpoint the change which lead to this regression? Probably the introduction of the rank table?

I thought allowing more types of ordering was a good idea, but maybe it wasn't! There's certainly a way to make this work again, and if not, removing the offending features would be worth a thought certainly.

@ihasdapie
Copy link

I'm on version 0.16.1, I'm not sure what version exactly the regression came in since the table only got that big recently.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Oct 17, 2024

For what it's worth, I have a table with about 400k entries and found that the performance of ancestors() and descendants() has dropped significantly. I really liked the simplicity of this library, but the performance issues are starting to show.

@AlexanderArvidsson do you have a working snippet for ancestors()? How did you integrate this, and were there any caveats in your testing?

Yes, I do. We just put this as a method on the django model. No caveats so far, it is doing very well for us and we have a lot of rows.

    def ancestors(self, of=None, include_self=False):
        """
        Returns all ancestors of the current node, in order of their depth relative to the root
        """
        if not self.pk:
            raise ValueError("Cannot get ancestors of an unsaved node")

        def make_cte(cte):
            # non-recursive: get current node
            return (
                Directory.objects.filter(id=pk(of or self.pk))
                .annotate(
                    _steps=Value(0, output_field=IntegerField()),
                )
                .union(
                    # recursive union: get ancestors
                    cte.join(Directory, id=cte.col.parent_id).annotate(
                        _steps=cte.col._steps + Value(1, output_field=IntegerField()),
                    ),
                    all=True,
                )
            )

        cte = With.recursive(make_cte)

        qs = cte.queryset().with_cte(cte).order_by(cte.col._steps.desc())

        if not include_self:
            qs = qs.exclude(id=self.pk)

        return qs

Since I reported this issue, we've unfortunately had to ditch django-tree-queries and rely on django-cte, allowing us to write our own CTEs using django syntax. We don't need the rank table or anything complex like that (multi-db, etc), and the huge performance impact it had meant we had to look for alternatives.

@matthiask Don't let any of this discourage you (you shouldn't apologize, your use-case just wasn't the same as ours), I'm grateful for this library and your previous django-mptt library, they worked well for years. Some of the fundamental decisions means it can't perform well under a scaling database, but what's written in this issue so far summarizes it well, so I think there's lots of info for you to solve these issues! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants