Skip to content

Pre-aggregations give incorrect result for rolling_window measure #8745

Open
@igorlukanin

Description

@igorlukanin

Describe the bug
The same query yields different results when executed with and without pre-aggregations. Looks like incorrect SQL is generated for Cube Store.

To Reproduce

  1. Use the data model shown below.
  2. Run the following query:
{
  "measures": [
    "Order.rollingCount"
  ],
  "limit": 5000,
  "timeDimensions": [
    {
      "dimension": "Order.createdAt",
      "dateRange": [
        "2023-07-11",
        "2023-07-18"
      ]
    }
  ]
}
  1. Get NULL as a result:
Screenshot 2024-09-25 at 18 28 41 4. Check that the generated SQL is as follows (with a suspicious wrapping around `base`):
SELECT
  sum(
    CASE
      WHEN null <= to_timestamp(?) THEN `order__rolling_count`
    END
  ) `order__rolling_count`
FROM
  (
    SELECT
      date_trunc('day', `order__created_at_day`) `order__created_at_day`,
      sum(`order__rolling_count`) `order__rolling_count`
    FROM
      prod_pre_aggregations.order_total_by_day AS `order__total_by_day`
    WHERE
      (`order__created_at_day` <= to_timestamp(?))
    GROUP BY
      1
  ) `base`

Expected behavior
4. Comment out the pre-aggregation in the data model
6. Run the same query
7. Get 4 as the result
Screenshot 2024-09-25 at 18 30 38
8. See that different SQL is generated (no wrappings around base):

SELECT
  "order__rolling_count" "order__rolling_count"
FROM
  (
    SELECT
      count("order_rolling_count_cumulative__order".id) "order__rolling_count"
    FROM
      (
        SELECT
          1 AS id,
          '2023-07-01T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          2 AS id,
          '2023-07-05T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          3 AS id,
          '2023-07-11T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          4 AS id,
          '2023-07-15T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          5 AS id,
          '2023-07-21T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          6 AS id,
          '2023-07-25T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          7 AS id,
          '2023-07-29T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          8 AS id,
          '2023-08-02T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          9 AS id,
          '2023-08-06T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          10 AS id,
          '2023-08-12T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          11 AS id,
          '2023-08-16T00:00:00.000Z' :: TIMESTAMP AS created_at
        UNION ALL
        SELECT
          12 AS id,
          '2023-08-22T00:00:00.000Z' :: TIMESTAMP AS created_at
      ) AS "order_rolling_count_cumulative__order"
    WHERE
      (
        (
          "order_rolling_count_cumulative__order".created_at :: timestamptz AT TIME ZONE 'UTC'
        ) <= ($ 1 :: timestamptz :: timestamptz AT TIME ZONE 'UTC')
      )
  ) as q_0
LIMIT
  5000

Screenshots
If applicable, add screenshots to help explain your problem.

Minimally reproducible Cube Schema

cube(`Order`, {
  sql: `
    SELECT 1 AS id, '2023-07-01T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 2 AS id, '2023-07-05T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 3 AS id, '2023-07-11T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 4 AS id, '2023-07-15T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 5 AS id, '2023-07-21T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 6 AS id, '2023-07-25T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 7 AS id, '2023-07-29T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 8 AS id, '2023-08-02T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 9 AS id, '2023-08-06T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 10 AS id, '2023-08-12T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 11 AS id, '2023-08-16T00:00:00.000Z'::TIMESTAMP AS created_at UNION ALL
    SELECT 12 AS id, '2023-08-22T00:00:00.000Z'::TIMESTAMP AS created_at
  `,
  
  measures: {
    rollingCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
        trailing: `unbounded`,
      },
    },
  },

  dimensions: {
      createdAt: {
        sql: `created_at`,
        type: `time`
      },
  },

  preAggregations: {
    totalByDay: {
      measures: [
        Order.rollingCount
      ],
      timeDimension: Order.createdAt,
      granularity: `day`,
      refreshKey: {
        every: `1 minute`,
      },
    },
  }
})

Version:
v0.36.2

Additional context
Found while investigating #6951

Metadata

Metadata

Assignees

Labels

backend:serverIssues relating to Cube Core's Servercube storeIssues relating to Cube Store

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions