sqlite: expose backup api #56253

Open: wants to merge 14 commits into base: main

geeksilva97 (Contributor) commented Dec 14, 2024

Closes #55413

This PR exposes the SQLite Online Backup API, which allows a database to be backed up while it is in use.

The API is inspired by better-sqlite3: https://github.com/WiseLibs/better-sqlite3/blob/master/docs/api.md#backupdestination-options---promise.

Multithreading caveats

As long as writes come from the same process and the same database handle (sqlite3*), the backup continues to progress as expected. Writes from any other connection can cause the backup process to restart. From the SQLite docs:

...If the source database is not an in-memory database, and the write is performed from within the same process as the backup operation and uses the same database handle (pDb), then the destination database (the one opened using connection pFile) is automatically updated along with the source.

Writes to an in-memory source database, or writes to a file-based source database by an external process or thread using a database connection other than pDb are significantly more expensive than writes made to a file-based source database using pDb (as the entire backup operation must be restarted...)
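
To make the restart behavior quoted above concrete, here is a toy model in plain JavaScript. It does not use SQLite at all: `simulateBackup`, the `pages` array, and the `version` counter are hypothetical names invented for this sketch, and the same-handle case (where the destination is updated along with the source) is not modeled. It only illustrates that a write the backup cannot track discards the progress made so far.

```javascript
// Toy model only: "pages" are array entries, not real SQLite pages.
function simulateBackup(source, { rate = 100, onStep = () => {} } = {}) {
  let dest = [];
  let copied = 0;
  let restarts = 0;
  let seenVersion = source.version;

  while (copied < source.pages.length) {
    if (source.version !== seenVersion) {
      // The source changed through a connection the backup cannot track:
      // throw away the copied pages and start again from the first page.
      dest = [];
      copied = 0;
      restarts += 1;
      seenVersion = source.version;
    }
    // Copy at most `rate` pages per step.
    const batch = source.pages.slice(copied, copied + rate);
    dest.push(...batch);
    copied += batch.length;
    onStep({ copied, total: source.pages.length });
  }
  return { dest, restarts };
}

const source = { pages: ['p0', 'p1', 'p2', 'p3', 'p4'], version: 0 };
let wrote = false;
const result = simulateBackup(source, {
  rate: 2,
  onStep() {
    if (!wrote) {
      wrote = true;
      source.pages[0] = 'p0-modified';
      source.version += 1; // simulated write from another connection
    }
  },
});
console.log(result.restarts, result.dest);
```

In this model the single external write costs one restart, and the finished copy contains the modified page.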

nodejs-github-bot added the c++, needs-ci, and sqlite labels Dec 14, 2024
geeksilva97 force-pushed the sqlite-backup branch 3 times, most recently from 6c61b4c to 9b80c2d (January 10, 2025 21:43)
geeksilva97 force-pushed the sqlite-backup branch 4 times, most recently from 632edd3 to 174ace5 (January 12, 2025 01:07)
geeksilva97 changed the title from "[wip] sqlite: expose backup api" to "sqlite: expose backup api" Jan 12, 2025
geeksilva97 marked this pull request as ready for review January 12, 2025 01:29
geeksilva97 (Contributor, Author) commented Jan 12, 2025

This comment mentions that DatabaseSync needs to have only sync operations (makes sense).

The backup can be performed synchronously, but I don't think it should be.

I wonder whether this PR would be a better fit for an async API.

Any thoughts?

Thanks in advance

geeksilva97 force-pushed the sqlite-backup branch 3 times, most recently from 69a3bf3 to bd43083 (January 12, 2025 02:19)

codecov bot commented Jan 12, 2025

Codecov Report

Attention: Patch coverage is 81.42857% with 39 lines in your changes missing coverage. Please review.

Project coverage is 89.20%. Comparing base (808e6b3) to head (7b15cae).
Report is 88 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/node_sqlite.cc | 81.42% | 16 Missing and 23 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56253      +/-   ##
==========================================
+ Coverage   89.18%   89.20%   +0.02%     
==========================================
  Files         662      662              
  Lines      191759   192178     +419     
  Branches    36911    36992      +81     
==========================================
+ Hits       171017   171439     +422     
+ Misses      13613    13562      -51     
- Partials     7129     7177      +48     

| Files with missing lines | Coverage Δ |
|---|---|
| src/node_sqlite.h | 70.00% <ø> (ø) |
| src/node_sqlite.cc | 79.92% <81.42%> (+0.27%) ⬆️ |

... and 79 files with indirect coverage changes

geeksilva97 marked this pull request as draft January 12, 2025 13:43
geeksilva97 force-pushed the sqlite-backup branch 2 times, most recently from 371f368 to a92e692 (January 16, 2025 00:59)
geeksilva97 (Contributor, Author) commented Jan 20, 2025

The failing test seems unrelated.

geeksilva97 (Contributor, Author) commented

@jasnell @cjihrig do you think I need to check for memory leaks using valgrind here? If so, is there a command in Node.js that would allow me to do so? I saw a `make test-valgrind` target, but I don't know how it should be used.

Thanks in advance

cjihrig (Contributor) commented Jan 20, 2025

do you think I need to check any memory leak using valgrind here?

That is not a requirement.

-->

* `sourceDb` {DatabaseSync} The database to back up. The source database must be open.
* `destination` {string} The path where the backup will be created. If the file already exists, the contents will be
Member

If this is assumed to be a filesystem path, then handling it the same as our other fs APIs would be ideal. Paths can be expressed as strings, Buffers, or file:// scheme URLs. If you're not familiar, there are internal utilities for normalizing these into a usable path.

Contributor Author

Makes sense. I will search for examples to see how to do that.
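
As a rough illustration of the normalization the reviewer describes, the sketch below accepts a string, a Buffer, or a `file:` URL and returns a plain path string. `normalizeDestination` is a hypothetical helper written for this example; it is not the internal utility Node.js core uses, and it assumes Buffer paths are UTF-8 encoded. Only the public `fileURLToPath()` from `node:url` is used.

```javascript
const { fileURLToPath } = require('node:url');

// Hypothetical helper, not the internal Node.js core utility.
function normalizeDestination(destination) {
  if (typeof destination === 'string') return destination;
  // Assumption: the Buffer holds a UTF-8 encoded path.
  if (Buffer.isBuffer(destination)) return destination.toString('utf8');
  if (destination instanceof URL) {
    // fileURLToPath() throws for any scheme other than file:
    return fileURLToPath(destination);
  }
  throw new TypeError('destination must be a string, Buffer, or file: URL');
}

console.log(normalizeDestination('backup.db'));
console.log(normalizeDestination(Buffer.from('backup.db')));
```

A real implementation would presumably reuse whatever the fs APIs already do, so that error codes and edge cases stay consistent across modules.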

* `options` {Object} Optional configuration for the backup. The
following properties are supported:
* `source` {string} Name of the source database. **Default:** `'main'`.
* `target` {string} Name of the target database. **Default:** `'main'`.
Member

For users who may not be that familiar with things here, the docs could likely benefit from more explanation about how source and target are used.

* Returns: {Promise} A promise that resolves when the backup is completed and rejects if an error occurs.

This method makes a database backup. This method abstracts the [`sqlite3_backup_init()`][], [`sqlite3_backup_step()`][]
and [`sqlite3_backup_finish()`][] functions.
Member

Since the backup operation is async and may take some time, some discussion in here about concurrent edits while the db is being backed up would be helpful for users to understand what to expect.

* `source` {string} Name of the source database. **Default:** `'main'`.
* `target` {string} Name of the target database. **Default:** `'main'`.
* `rate` {number} Number of pages to be transmitted in each batch of the backup. **Default:** `100`.
* `progress` {Function} Callback function that will be called with the number of pages copied and the total number of
Member

Not something that needs to be added in this PR but... would it make sense for this operation to support cancelation via an AbortSignal?

Contributor Author

+1
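
To show what AbortSignal-based cancellation could look like for a batched operation, here is a sketch that involves no SQLite. `copyInBatches` is a hypothetical stand-in for the stepped backup, and the `signal` option is an assumption for illustration; it is not part of this PR. The loop checks the signal before each batch and yields to the event loop between batches so that an abort can be observed.

```javascript
// Hypothetical stand-in for a stepped backup; no SQLite involved.
async function copyInBatches(totalPages, { rate = 100, signal } = {}) {
  let copied = 0;
  while (copied < totalPages) {
    // Throws signal.reason (an AbortError by default) once aborted.
    signal?.throwIfAborted();
    copied = Math.min(copied + rate, totalPages);
    // Yield between batches so a pending abort() call gets a chance to run.
    await new Promise((resolve) => setImmediate(resolve));
  }
  return copied;
}

const controller = new AbortController();
const pending = copyInBatches(1000, { rate: 100, signal: controller.signal });
controller.abort();
pending.catch((err) => console.log('cancelled:', err.name));
```

Checking the signal only at batch boundaries keeps each batch atomic, at the cost of a cancellation delay of up to one batch.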

const database = makeSourceDb();

t.assert.throws(() => {
backup(database);
Member

Per web platform API conventions, APIs that return a Promise should generally reject the promise as opposed to throwing synchronously. I know we haven't always been super consistent with that for Node.js APIs, though. Non-blocking, but I think I would prefer this to reject rather than throw.

Contributor

Isn't this an input validation issue though? Node APIs generally throw in this case.

Member

Yep, that's why it's not blocking and just a preference.
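
The two behaviors discussed in this thread can be compared side by side in plain JavaScript. The functions below are hypothetical and do not call the real `backup()`; they only show how the same validation failure surfaces as a synchronous throw in one style and as a rejected promise in the other.

```javascript
// Hypothetical validation shared by both variants.
function validateDestination(destination) {
  if (typeof destination !== 'string') {
    throw new TypeError('The "destination" argument must be a string.');
  }
}

// Variant 1: invalid input throws before any promise is created.
function backupThrowing(destination) {
  validateDestination(destination);
  return Promise.resolve(destination);
}

// Variant 2: an async function turns the same throw into a rejection.
async function backupRejecting(destination) {
  validateDestination(destination);
  return destination;
}

try {
  backupThrowing(undefined);
} catch (err) {
  console.log('sync throw:', err.name);
}
backupRejecting(undefined).catch((err) => console.log('rejection:', err.name));
```

With the first variant, a caller needs a `try`/`catch` around the call itself; with the second, a single `.catch()` or `await` inside `try` covers both validation and runtime failures.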

Development

Successfully merging this pull request may close these issues.

node:sqlite: support database.backup
5 participants