Commit 4d4a440

benitav and simolus3 authored
raw tables support (WIP) (#198)
* init
* Code snippets
* Fix JS typo
* More typos
* Make it clearer that powersync_crud is virtual
* polish
* polish
* mention raw tables in client architecture
* more fitting icon
* Consolidate message
* additional polish
* Mention release versions for JS
* More notes on migrations
* Add minimum versions for Dart and Kotlin
* Title case

Co-authored-by: Simon Binder <[email protected]>
1 parent b57063f commit 4d4a440

4 files changed: +297 -3 lines changed

architecture/client-architecture.mdx

Lines changed: 5 additions & 1 deletion
@@ -64,4 +64,8 @@ The client SDK maintains the following tables:
6464

6565
Most rows will be present in at least two tables — the `ps_data__<table>` table, and in `ps_oplog`. It may be present multiple times in `ps_oplog`, if it was synced via multiple buckets.
6666

67-
The copy in `ps_oplog` may be newer than the one in `ps_data__<table>`. Only when a full checkpoint has been downloaded, will the data be copied over to the individual tables. If multiple rows with the same table and id has been synced, only one will be preserved (the one with the highest `op_id`).
67+
The copy in `ps_oplog` may be newer than the one in `ps_data__<table>`. Only when a full checkpoint has been downloaded, will the data be copied over to the individual tables. If multiple rows with the same table and id has been synced, only one will be preserved (the one with the highest `op_id`).
68+
69+
<Note>
70+
If you run into limitations with the above JSON-based SQLite view system, check out [this experimental feature](/usage/use-case-examples/raw-tables) which allows you to define and manage raw SQLite tables to work around some limitations. We are actively seeking feedback about this functionality.
71+
</Note>

docs.json

Lines changed: 1 addition & 0 deletions
@@ -187,6 +187,7 @@
187187
"usage/use-case-examples/offline-only-usage",
188188
"usage/use-case-examples/postgis",
189189
"usage/use-case-examples/prioritized-sync",
190+
"usage/use-case-examples/raw-tables",
190191
"usage/use-case-examples/custom-write-checkpoints"
191192
]
192193
},

usage/use-case-examples.mdx

Lines changed: 1 addition & 2 deletions
@@ -1,8 +1,6 @@
11
---
22
title: "Use Case Examples"
33
description: "Learn how to use PowerSync in common use cases"
4-
mode: wide
5-
sidebarTitle: Overview
64
---
75

86
The following examples are available to help you get started with specific use cases for PowerSync:
@@ -19,6 +17,7 @@ The following examples are available to help you get started with specific use c
1917
<Card title="Local-only Usage" icon="laptop" href="/usage/use-case-examples/offline-only-usage" horizontal/>
2018
<Card title="PostGIS" icon="map" href="/usage/use-case-examples/postgis" horizontal/>
2119
<Card title="Prioritized Sync" icon="star" href="/usage/use-case-examples/prioritized-sync" horizontal/>
20+
<Card title="Raw SQLite Tables" icon="table" href="/usage/use-case-examples/raw-tables" horizontal/>
2221
</CardGroup>
2322

2423
## Additional Resources

usage/use-case-examples/raw-tables.mdx

Lines changed: 290 additions & 0 deletions
@@ -0,0 +1,290 @@
1+
---
2+
title: "Raw SQLite Tables to Bypass JSON View Limitations"
3+
description: "Use raw tables for native SQLite functionality and improved performance."
4+
sidebarTitle: "Raw SQLite Tables"
5+
---
6+
7+
<Warning>
8+
Raw tables are an experimental feature. We're actively seeking feedback on:
9+
10+
- API design and developer experience
11+
- Additional features or optimizations needed
12+
13+
Join our [Discord community](https://discord.gg/powersync) to share your experience and get help.
14+
</Warning>
15+
16+
By default, PowerSync uses a [JSON-based view system](/architecture/client-architecture#schema) where data is stored schemalessly in JSON format and then presented through SQLite views based on the client-side schema. Raw tables allow you to define native SQLite tables in the client-side schema, bypassing this.
17+
18+
This eliminates overhead associated with extracting values from the JSON data and provides access to advanced SQLite features like foreign key constraints and custom indexes.
19+
20+
<Note>
21+
**Availability**
22+
23+
Raw tables were introduced in the following versions of our client SDKs:
24+
- __JavaScript__ (Node: `0.8.0`, React-Native: `1.23.0`, Web: `1.24.0`)
25+
- __Dart__: Version 1.15.0 of `package:powersync`.
26+
- __Kotlin__: Version 1.3.0
27+
28+
Also note that raw tables are only supported by the new [Rust-based sync client](https://releases.powersync.com/announcements/improved-sync-performance-in-our-client-sdks), which is currently opt-in.
29+
</Note>
30+
31+
## When to Use Raw Tables
32+
33+
Consider raw tables when you need:
34+
35+
- **Advanced SQLite features** like `FOREIGN KEY` and `ON DELETE CASCADE` constraints
36+
- **Indexes** - PowerSync's default schema has basic support for indexes on columns, while raw tables give you complete control to create indexes on expressions, use `GENERATED` columns, etc
37+
- **Improved performance** for complex queries (e.g., `SELECT SUM(value) FROM transactions`) - raw tables more efficiently get these values directly from the SQLite column, instead of extracting the value from the JSON object on every row
38+
- **Reduced storage overhead** - eliminate the JSON object overhead for each row in the `ps_data__<table>.data` column
39+
- **To manually create tables** - Sometimes you need full control over table creation, for example when implementing custom triggers
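
As a rough illustration of the first two points, the following sketch uses Python's built-in `sqlite3` module rather than a PowerSync SDK. The `todos` table and the index name are hypothetical and exist only for this example. It shows a `FOREIGN KEY` with `ON DELETE CASCADE` and an index on an expression, both of which are plain SQLite features available on raw tables.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled on the connection.
db.execute("PRAGMA foreign_keys = ON")
db.executescript("""
CREATE TABLE todo_lists (
  id TEXT NOT NULL PRIMARY KEY,
  title TEXT NOT NULL
);

-- Hypothetical child table, used only for this illustration.
CREATE TABLE todos (
  id TEXT NOT NULL PRIMARY KEY,
  list_id TEXT NOT NULL REFERENCES todo_lists (id) ON DELETE CASCADE,
  description TEXT NOT NULL
);

-- An index on an expression instead of a plain column.
CREATE INDEX todos_description_lower ON todos (lower(description));
""")

db.execute("INSERT INTO todo_lists (id, title) VALUES ('l1', 'Groceries')")
db.execute("INSERT INTO todos (id, list_id, description) VALUES ('t1', 'l1', 'Milk')")

# Deleting the parent row cascades to its todos.
db.execute("DELETE FROM todo_lists WHERE id = 'l1'")
remaining_todos = db.execute("SELECT count(*) FROM todos").fetchone()[0]
```

Note that foreign key enforcement is off by default in SQLite, so the pragma has to be set on every connection that should enforce the constraint.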
40+
41+
## How Raw Tables Work
42+
43+
### Current JSON-Based System
44+
45+
Currently the sync system involves two general steps:
46+
47+
1. Download sync bucket operations from the PowerSync Service
48+
2. Once the client has a complete checkpoint and no pending local changes in the upload queue, sync the local database with the bucket operations
49+
50+
The bucket operations use JSON to store the individual operation data. The local database uses tables with a simple schemaless `ps_data__<table_name>` structure containing only an `id` (TEXT) and `data` (JSON) column.
51+
52+
PowerSync automatically creates views on that table that extract JSON fields to resemble standard tables reflecting your schema.
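
To make the difference concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The view definition is an assumption made for illustration and is not the exact SQL that PowerSync generates; `raw_transactions` is a hypothetical raw table. Both queries return the same sum, but the view has to extract `value` from the JSON text of every row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Schemaless storage: one id column and one JSON column.
CREATE TABLE ps_data__transactions (id TEXT PRIMARY KEY, data TEXT);

-- Assumed shape of a view exposing a JSON field as a column.
CREATE VIEW transactions AS
  SELECT id, json_extract(data, '$.value') AS value
  FROM ps_data__transactions;

-- Hypothetical raw table storing the same value in a native column.
CREATE TABLE raw_transactions (id TEXT NOT NULL PRIMARY KEY, value INTEGER);
""")

for row_id, value in [("a", 10), ("b", 32)]:
    db.execute(
        "INSERT INTO ps_data__transactions (id, data) VALUES (?, json_object('value', ?))",
        (row_id, value),
    )
    db.execute("INSERT INTO raw_transactions (id, value) VALUES (?, ?)", (row_id, value))

# The view parses JSON per row; the raw table reads the column directly.
view_sum = db.execute("SELECT SUM(value) FROM transactions").fetchone()[0]
raw_sum = db.execute("SELECT SUM(value) FROM raw_transactions").fetchone()[0]
```

This assumes the SQLite build used by Python includes the JSON functions, which is the default in recent SQLite versions.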
53+
54+
### Raw Tables Approach
55+
56+
When opting in to raw tables, you are responsible for creating the tables before using them - PowerSync will no longer create them automatically.
57+
58+
Because PowerSync does not manage raw tables, you need to manually:
59+
60+
1. Tell PowerSync how to map the [schemaless protocol](/architecture/powersync-protocol#protocol) to your raw tables when syncing data.
61+
2. Configure custom triggers to forward local writes to PowerSync.
62+
63+
For the purpose of this example, consider a simple table like this:
64+
65+
```sql
66+
CREATE TABLE todo_lists (
67+
id TEXT NOT NULL PRIMARY KEY,
68+
created_by TEXT NOT NULL,
69+
title TEXT NOT NULL,
70+
content TEXT
71+
) STRICT;
72+
```
73+
74+
#### Syncing into raw tables
75+
76+
To sync into the raw `todo_lists` table instead of `ps_data__`, PowerSync needs the SQL statements extracting
77+
columns from the untyped JSON protocol used during syncing.
78+
This involves specifying two SQL statements:
79+
80+
1. A `put` SQL statement for upserts, responsible for creating a `todo_lists` row or updating it based on its `id` and data columns.
81+
2. A `delete` SQL statement responsible for deletions.
82+
83+
The PowerSync client in our SDKs automatically runs these statements in response to sync lines sent from the PowerSync Service.
84+
85+
To reference the ID or extract values, prepared statements with parameters are used. `delete` statements can reference the id of the affected row, while `put` statements can also reference individual column values.
86+
Declaring these statements and parameters happens as part of the schema passed to PowerSync databases:
87+
88+
<Tabs>
89+
90+
<Tab title="JavaScript">
91+
Raw tables are not included in the regular `Schema()` object. Instead, add them afterwards using `withRawTables`.
92+
For each raw table, specify the `put` and `delete` statements. Parameter values are described as a JSON array whose entries are either:
94+
95+
- the string `Id` to reference the id of the affected row.
96+
- the object `{ Column: name }` to reference the value of the column `name`.
97+
98+
```JavaScript
99+
const mySchema = new Schema({
100+
// Define your PowerSync-managed schema here
101+
// ...
102+
});
103+
mySchema.withRawTables({
104+
// The name here doesn't have to match the name of the table in SQL. Instead, it's used to match
105+
// the table name from the backend database as sent by the PowerSync service.
106+
todo_lists: {
107+
put: {
108+
sql: 'INSERT OR REPLACE INTO todo_lists (id, created_by, title, content) VALUES (?, ?, ?, ?)',
109+
params: ['Id', { Column: 'created_by' }, { Column: 'title' }, { Column: 'content' }]
110+
},
111+
delete: {
112+
sql: 'DELETE FROM todo_lists WHERE id = ?',
113+
params: ['Id']
114+
}
115+
}
116+
});
117+
```
118+
119+
We will simplify this API after understanding the use-cases for raw tables better.
120+
</Tab>
121+
122+
<Tab title="Dart">
123+
Raw tables are not part of the regular tables list and can be defined with the optional `rawTables` parameter.
124+
125+
```dart
126+
final schema = Schema(const [], rawTables: const [
127+
RawTable(
128+
// The name here doesn't have to match the name of the table in SQL. Instead, it's used to match
129+
// the table name from the backend database as sent by the PowerSync service.
130+
name: 'todo_lists',
131+
put: PendingStatement(
132+
sql: 'INSERT OR REPLACE INTO todo_lists (id, created_by, title, content) VALUES (?, ?, ?, ?)',
133+
params: [
134+
PendingStatementValue.id(),
135+
PendingStatementValue.column('created_by'),
136+
PendingStatementValue.column('title'),
137+
PendingStatementValue.column('content'),
138+
],
139+
),
140+
delete: PendingStatement(
141+
sql: 'DELETE FROM todo_lists WHERE id = ?',
142+
params: [
143+
PendingStatementValue.id(),
144+
],
145+
),
146+
),
147+
]);
148+
```
149+
</Tab>
150+
151+
<Tab title="Kotlin">
152+
To define a raw table, include it in the list of tables passed to the `Schema`:
153+
154+
```Kotlin
155+
val schema = Schema(listOf(
156+
RawTable(
157+
// The name here doesn't have to match the name of the table in SQL. Instead, it's used to match
158+
// the table name from the backend database as sent by the PowerSync service.
159+
name = "todo_lists",
160+
put = PendingStatement(
161+
"INSERT OR REPLACE INTO todo_lists (id, created_by, title, content) VALUES (?, ?, ?, ?)",
162+
listOf(
163+
PendingStatementParameter.Id,
164+
PendingStatementParameter.Column("created_by"),
165+
PendingStatementParameter.Column("title"),
166+
PendingStatementParameter.Column("content")
167+
)
168+
),
169+
delete = PendingStatement(
170+
"DELETE FROM todo_lists WHERE id = ?", listOf(PendingStatementParameter.Id)
171+
)
172+
)
173+
))
174+
```
175+
</Tab>
176+
177+
<Tab title="Swift">
178+
Unfortunately, raw tables are not available in the Swift SDK yet.
179+
</Tab>
180+
181+
<Tab title=".NET">
182+
Unfortunately, raw tables are not available in the .NET SDK yet.
183+
</Tab>
184+
185+
</Tabs>
186+
187+
After adding raw tables to the schema, you're also responsible for creating them by executing the
188+
corresponding `CREATE TABLE` statement before `connect()`-ing the database.
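
To build intuition for how the `put` and `delete` statements and their parameters fit together, the sketch below simulates the idea with Python's built-in `sqlite3` module. It is not the SDK's implementation: `resolve_params` and `apply_operation` are hypothetical helpers, and binding `NULL` for columns missing from the synced JSON is an assumption made for this example.

```python
import json
import sqlite3

# Statement descriptions mirroring the `put` / `delete` shape from the schema examples.
PUT = {
    "sql": "INSERT OR REPLACE INTO todo_lists (id, created_by, title, content) VALUES (?, ?, ?, ?)",
    "params": ["Id", {"Column": "created_by"}, {"Column": "title"}, {"Column": "content"}],
}
DELETE = {"sql": "DELETE FROM todo_lists WHERE id = ?", "params": ["Id"]}


def resolve_params(params, row_id, data):
    """Turn parameter descriptors into positional values for a prepared statement."""
    values = []
    for param in params:
        if param == "Id":
            values.append(row_id)
        else:
            # Assumption: columns absent from the JSON data are bound as NULL.
            values.append(data.get(param["Column"]))
    return values


def apply_operation(db, statement, row_id, data_json=None):
    """Run a put or delete statement for one synced row."""
    data = json.loads(data_json) if data_json else {}
    db.execute(statement["sql"], resolve_params(statement["params"], row_id, data))


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE todo_lists (id TEXT NOT NULL PRIMARY KEY, created_by TEXT NOT NULL, "
    "title TEXT NOT NULL, content TEXT)"
)

# Two puts for the same id: the second one replaces the first (upsert).
apply_operation(db, PUT, "l1", '{"created_by": "u1", "title": "Groceries"}')
apply_operation(db, PUT, "l1", '{"created_by": "u1", "title": "Shopping", "content": "weekly"}')
after_put = db.execute("SELECT id, created_by, title, content FROM todo_lists").fetchall()

apply_operation(db, DELETE, "l1")
after_delete = db.execute("SELECT count(*) FROM todo_lists").fetchone()[0]
```

The order of the `params` entries has to match the order of the `?` placeholders in the SQL, since the values are bound positionally.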
189+
190+
#### Collecting local writes on raw tables
191+
192+
PowerSync uses an internal SQLite table to collect local writes. For PowerSync-managed views, a trigger for
193+
insertions, updates and deletions automatically forwards local mutations into this table.
194+
When using raw tables, defining those triggers is your responsibility.
195+
196+
The [PowerSync SQLite extension](https://github.com/powersync-ja/powersync-sqlite-core) creates an insert-only virtual table named `powersync_crud` with these columns:
197+
198+
```SQL
199+
CREATE VIRTUAL TABLE powersync_crud(
200+
-- The type of operation: 'PUT' or 'DELETE'
201+
op TEXT,
202+
-- The id of the affected row
203+
id TEXT,
204+
-- The name of the affected table (e.g. 'todo_lists')
type TEXT,
205+
-- optional (not set on deletes): The column values for the row
206+
data TEXT,
207+
-- optional: Previous column values to include in a CRUD entry
208+
old_values TEXT,
209+
-- optional: Metadata for the write to include in a CRUD entry
210+
metadata TEXT
211+
);
212+
```
213+
214+
The virtual table associates local mutations with the current transaction and ensures writes made during the sync
215+
process (applying server-side changes) don't count as local writes.
216+
This means that triggers can be defined on raw tables like so:
217+
218+
```SQL
219+
CREATE TRIGGER todo_lists_insert
220+
AFTER INSERT ON todo_lists
221+
FOR EACH ROW
222+
BEGIN
223+
INSERT INTO powersync_crud (op, id, type, data) VALUES ('PUT', NEW.id, 'todo_lists', json_object(
224+
'created_by', NEW.created_by,
225+
'title', NEW.title,
226+
'content', NEW.content
227+
));
228+
END;
229+
230+
CREATE TRIGGER todo_lists_update
231+
AFTER UPDATE ON todo_lists
232+
FOR EACH ROW
233+
BEGIN
234+
INSERT INTO powersync_crud (op, id, type, data) VALUES ('PUT', NEW.id, 'todo_lists', json_object(
235+
'created_by', NEW.created_by,
236+
'title', NEW.title,
237+
'content', NEW.content
238+
));
239+
END;
240+
241+
CREATE TRIGGER todo_lists_delete
242+
AFTER DELETE ON todo_lists
243+
FOR EACH ROW
244+
BEGIN
245+
INSERT INTO powersync_crud (op, id, type) VALUES ('DELETE', OLD.id, 'todo_lists');
246+
END;
247+
```
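
The trigger pattern can be tried out without PowerSync by using Python's built-in `sqlite3` module. In this sketch, `powersync_crud` is a plain table standing in for the real virtual table (an assumption made so the example runs on stock SQLite), so it only shows which rows the triggers would forward, not how PowerSync processes them.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE todo_lists (
  id TEXT NOT NULL PRIMARY KEY,
  created_by TEXT NOT NULL,
  title TEXT NOT NULL,
  content TEXT
);

-- Stand-in only: a regular table replacing the real `powersync_crud` virtual table.
CREATE TABLE powersync_crud (op TEXT, id TEXT, type TEXT, data TEXT);

CREATE TRIGGER todo_lists_insert AFTER INSERT ON todo_lists FOR EACH ROW
BEGIN
  INSERT INTO powersync_crud (op, id, type, data) VALUES ('PUT', NEW.id, 'todo_lists',
    json_object('created_by', NEW.created_by, 'title', NEW.title, 'content', NEW.content));
END;

CREATE TRIGGER todo_lists_update AFTER UPDATE ON todo_lists FOR EACH ROW
BEGIN
  INSERT INTO powersync_crud (op, id, type, data) VALUES ('PUT', NEW.id, 'todo_lists',
    json_object('created_by', NEW.created_by, 'title', NEW.title, 'content', NEW.content));
END;

CREATE TRIGGER todo_lists_delete AFTER DELETE ON todo_lists FOR EACH ROW
BEGIN
  INSERT INTO powersync_crud (op, id, type) VALUES ('DELETE', OLD.id, 'todo_lists');
END;
""")

# Local writes: each one should be forwarded by exactly one trigger.
db.execute("INSERT INTO todo_lists VALUES ('l1', 'u1', 'Groceries', NULL)")
db.execute("UPDATE todo_lists SET title = 'Shopping' WHERE id = 'l1'")
db.execute("DELETE FROM todo_lists WHERE id = 'l1'")

recorded = db.execute("SELECT op, id, type, data FROM powersync_crud ORDER BY rowid").fetchall()
ops = [row[0] for row in recorded]
updated_title = json.loads(recorded[1][3])["title"]
```

Inspecting `recorded` this way is a quick check that every column of the raw table is included in the `json_object` call of both the insert and the update trigger.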
248+
## Migrations
249+
250+
In PowerSync's [JSON-based view system](/architecture/client-architecture#schema), the client-side schema is applied to the schemaless data, meaning no migrations are required. Raw tables, however, are excluded from this, so it is the developer's responsibility to manage migrations for these tables.
251+
252+
### Adding raw tables as a new table
253+
254+
When you're adding new tables to your sync rules, clients will start to sync data on those tables - even if the tables aren't mentioned
255+
in the client's schema yet.
256+
So at the time you're introducing a new raw table to your app, it's possible that PowerSync has already synced some data for that
257+
table, which would be stored in `ps_untyped`. When adding regular tables, PowerSync will automatically extract rows from `ps_untyped`.
258+
With raw tables, that step is your responsibility. To copy data, run these statements in a transaction after creating the table:
259+
260+
```sql
261+
INSERT INTO my_table (id, my_column, ...)
262+
SELECT id, data ->> 'my_column' FROM ps_untyped WHERE type = 'my_table';
263+
DELETE FROM ps_untyped WHERE type = 'my_table';
264+
```
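
As a runnable sketch of this copy step, the example below uses Python's built-in `sqlite3` module. The `ps_untyped` layout is inferred from the columns used in the query above (`type`, `id`, `data`) and is otherwise an assumption, and `json_extract` is used in place of `->>` so the example also runs on older SQLite versions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Assumed layout, based on the columns referenced in the statements above.
CREATE TABLE ps_untyped (type TEXT NOT NULL, id TEXT NOT NULL, data TEXT);
INSERT INTO ps_untyped VALUES
  ('todo_lists', 'l1', '{"created_by": "u1", "title": "Groceries", "content": null}'),
  ('other_table', 'o1', '{"name": "unrelated"}');
""")


def adopt_untyped_rows(db):
    """Create the raw table, then move already-synced rows into it in one transaction."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS todo_lists (id TEXT NOT NULL PRIMARY KEY, "
        "created_by TEXT NOT NULL, title TEXT NOT NULL, content TEXT)"
    )
    # `with db` commits if the block succeeds and rolls back if it raises.
    with db:
        db.execute("""
            INSERT INTO todo_lists (id, created_by, title, content)
            SELECT id,
                   json_extract(data, '$.created_by'),
                   json_extract(data, '$.title'),
                   json_extract(data, '$.content')
            FROM ps_untyped WHERE type = 'todo_lists'
        """)
        db.execute("DELETE FROM ps_untyped WHERE type = 'todo_lists'")


adopt_untyped_rows(db)
migrated = db.execute("SELECT id, created_by, title, content FROM todo_lists").fetchall()
left_in_untyped = db.execute("SELECT type FROM ps_untyped").fetchall()
```

Running the copy and the delete in the same transaction means a failure part-way through leaves `ps_untyped` untouched, so the step can be retried.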
265+
266+
This does not apply if you've been using the raw table from the beginning (and never called `connect()` without it). You only need this step when data for the table was already synced locally before you added the raw table.
268+
269+
Another workaround is to clear PowerSync data when changing raw tables and opt for a full resync.
270+
271+
### Migrating to raw tables
272+
273+
To migrate from PowerSync-managed tables to raw tables:
274+
275+
1. Open the database with the new schema mentioning raw tables. PowerSync will copy data from tables previously managed by PowerSync into
276+
`ps_untyped`.
277+
2. Create raw tables.
278+
3. Run the `INSERT INTO ... SELECT` statement shown above to copy `ps_untyped` data into your raw tables.
279+
280+
### Migrations on raw tables
281+
282+
When adding new columns to raw tables, there currently isn't a way to re-sync that table to populate those columns from the server. We are investigating possible workarounds and encourage users who need this to try out raw tables and share feedback.
284+
285+
To ensure the column values are accurate, you'd have to delete all data after a migration and wait for the next complete sync.
286+
287+
## Deleting data and raw tables
288+
289+
APIs that clear an entire PowerSync database, such as `disconnectAndClear()`, don't affect raw tables.
Keep this in mind when using those methods: data in raw tables needs to be deleted explicitly.
