-
Notifications
You must be signed in to change notification settings - Fork 43
Nested query results
SQL queries take "flat" tables (i.e. multisets of records whose fields are values of "base" type) and produce "flat" tables as results. By default, Links supports SQL-like queries with flat inputs and flat outputs, for example:
for (d <-- departments)
for (e <-- employees)
where(d.name == e.dept)
[(dept = d.name, name = e.name)]
This query will return a table with one row for each employee, giving their name and their department name. There is a fair amount of redundancy: if there are 50 people in the "Department of Redundancy Department", then there will be 50 copies of the department name in the result.
Links now also supports nested query results using a technique called query shredding, described in a SIGMOD 2014 paper and tech report referenced below. Using the query nested
keyword combination will evaluate queries whose results involve arbitrary combinations of collection (list/multiset), record, and base types. For example:
query nested {
for (d <-- departments)
[(dept = d.name, emps = for (e <-- employees)
where(d.name == e.dept)
[e.name])]
}
Notice that the above query has return type [(dept:String,emps:[String])]
, i.e. it is not a flat table result but has some nesting of collections in the emps
field. By default Links would reject this query because the result is not flat. With shredding enabled, Links executes this query; it is translated to two SQL queries yielding flat results and then the results are "stitched" together into a nested Links value. Moreover, with shredding enabled Links can handle queries with arbitrary nesting in the query result, subject to some caveats below.
Experience with GtoPdb suggests that using query nested
can result in good performance compared to a conventional (perhaps overly conservative) data access layer implemented in Java [Fowler et al. 2020].
Shredding is currently only known to be supported by PostgreSQL and more recent versions of SQLite3 and MySQL. It should work with other databases that support SQL:1999 but we haven't tested support with them. Trying to use shredding with a database not supporting SQL:1999 features such as ROW_NUMBER
will result in undefined behavior (probably just a runtime exception). It would be good to catch this statically or, failing that, make sure that the runtime exception is not too mysterious.
Links supports some simple forms of aggregation (e.g. SQL's count
, written in Links as length
), limiting (e.g. SQL's limit N
) and sorting (e.g. order by
). The interactions between nested query results and these features are not well understood, so for queries that rely on these features, shredding may not work.
Currently Links supports two query policies, namely nested
(described above) and flat
(the classic behavior, which supports only flat query results and also supports some features such as count, ordering and limit that are not supported by query nested
. Other policies may be added e.g. to support experimental features or prototype improvements to query support.