[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

srblum · 2024-07-19T22:45:06Z

Problem

When using large columnar tables, window functions in subqueries greatly slow down query performance, even when the window function columns are not referenced by the outer query.

I am hoping that this could be fixed by adjusting columnar scans to ignore columns in subqueries that are unreferenced.

I am more than happy to provide additional context, including query plans, plpgsql functions to recreate test data, version info, etc.

Thanks for your time and support. 🙇

Why would someone ever need an unreferenced window function?

I work on a manufacturing analytics platform that allows users to create and update views of their time-series data with arbitrary expressions. Some of those expressions contain window functions.

Below is an example of a command to create such a view, containing a trivial expression.

CREATE VIEW v_washer AS (
    SELECT *, 1 AS x
    FROM washer
);

After the view is created, it becomes the new basis for all users' queries on the platform, regardless of whether those queries need any of the computed expressions. This has been a fantastic way to allow users to easily customize their data.

Unfortunately, as soon as any user adds a window function to the view, it spoils load times for everyone.

What does the data look like?

Manufacturing time-series data. Each row represents the state of a machine for a given time interval. The data used for testing was structured as follows:

starttime	endtime	machine__source	...700 other columns
2021-01-01 00:00:00+00	2021-01-01 00:00:30+00	A	...
2021-01-01 00:00:30+00	2021-01-01 00:01:00+00	A	...
2021-01-01 00:01:00+00	2021-01-01 00:01:30+00	A	...
...	...	...	...

The test queries below were performed on a partitioned table with the following properties:

700 columns
12 million total rows
3 years of data, split into 36 monthly, columnar partitions
Each of the 36 partitions is about 36MB for a total of ~1.3 GB (highly compressed--the same data is ~100GB in row-based storage.)

Query performance

Below are average query speeds for each of 7 test queries. All queries were performed 3 times after throwing out the first result to remove the variance associated with caching.

Query	Time
Query1 (small subquery, no wf)	0.09s
Query2 (large subquery, no wf)	0.09s
Query3 (no subquery, wf)	0.15s
Query4 (small subquery, ignored wf)	3.0s
Query5 (small subquery, referenced wf)	6.2s
Query6 (large subquery, ignored wf)	7.5s
Query7 (large subquery, referenced wf)	9.8s

My hope is that, by fixing the query planner:

Query4 and Query6 will have times close to Query1 and Query2
Query7 will have a time close to Query5

Full Queries

Below are the full queries referenced in the table above.

Query1 (small subquery, no wf)

select sum("inner".output2) from (
    select
        endtime,
        machine__source,
        output + 1 as output2
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query2 (large subquery, no wf)

select sum("inner".output2) from (
    select
    	*,
    	output + 1 as output2
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query3 (no subquery, wf)

select
	case
		when
			shift = 'A' and
			lag(shift) over (partition by machine__source order by endtime) != 'A'
		then 1
		else 0
	end as window_expr
from timeseries
where machine__source in ('A') and
endtime > '2023-09-01'::timestamptz;

Query4 (small subquery, ignored wf)

select "inner".output from (
    select
		output,
		machine__source,
		endtime,
        case
          when shift = 'A' and
          lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query5 (small subquery, referenced wf)

select "inner".window_expr from (
    select
		output,
		machine__source,
		endtime,
        case
          when shift = 'A' and
          lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query6 (large subquery, ignored wf)

select "inner".output from (
    select
        *,
        case
          when
            shift = 'A' and
            lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query7 (large subquery, referenced wf)

select "inner".window_expr from (
    select
        *,
        case
          when
            shift = 'A' and
            lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Version

# select * from version();
                                                        version                                                        
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

# select * from pg_extension;
    oid    |    extname     | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition 
-----------+----------------+----------+--------------+----------------+------------+-----------+--------------
 867184318 | citus_columnar |       10 |           11 | f              | 11.3-1     |           |

Note: the issue was not resolved when running the same queries in PostgreSQL 16.3

The text was updated successfully, but these errors were encountered:

srblum changed the title ~~Columnar: window function in subquery greatly slows down query, even when unreferenced~~ [Performance] Columnar: window function in subquery greatly slows down query, even when unreferenced Jul 19, 2024

srblum changed the title ~~[Performance] Columnar: window function in subquery greatly slows down query, even when unreferenced~~ [Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced Jul 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

srblum commented Jul 19, 2024 •

edited

Loading

[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

Comments

srblum commented Jul 19, 2024 • edited Loading

Problem

Why would someone ever need an unreferenced window function?

What does the data look like?

Query performance

Full Queries

Query1 (small subquery, no wf)

Query2 (large subquery, no wf)

Query3 (no subquery, wf)

Query4 (small subquery, ignored wf)

Query5 (small subquery, referenced wf)

Query6 (large subquery, ignored wf)

Query7 (large subquery, referenced wf)

Version

srblum commented Jul 19, 2024 •

edited

Loading