Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sqlite] Add retry logic for SQLITE_BUSY error #915

Closed
aschmidt93 opened this issue Nov 26, 2024 · 3 comments
Closed

[sqlite] Add retry logic for SQLITE_BUSY error #915

aschmidt93 opened this issue Nov 26, 2024 · 3 comments
Labels
enhancement New feature or request sqlite

Comments

@aschmidt93
Copy link
Contributor

In a monorepo setup where multiple processes write to a shared SQLite file, the error SQLITE_BUSY is often thrown. The error is thrown even before the default timeout (5s) of bettersqlite3 is reached. The reason is described in this issue and this issue:

in short this then means that irrespective of any better-sqlite3 internal timeout handling, there can always be cases of such an exception on writes, which applications that can't afford to just error, need to handle by implementing their own rollback+retry.

And from the sqlite3 documentation:

The presence of a busy handler does not guarantee that it will be invoked when there is lock contention. If SQLite determines that invoking the busy handler could result in a deadlock, it will go ahead and return SQLITE_BUSY to the application instead of invoking the busy handler.

Therefore, apps should implement a retry logic. However, CAP apps are and should be agnostic of the underlying database. Therefore, I suggest adding a retry logic for SQLITE_BUSY to @cap-js/sqlite instead.

@patricebender patricebender added enhancement New feature or request sqlite labels Nov 26, 2024
@aschmidt93
Copy link
Contributor Author

The following example shall describe the underlying problem and a potential solution. Consider a monorepo consisting of multiple modules (CAP apps) where apps share a database. In development SQLite is used (although its weakest point is concurrency...). All apps write their logs to a logs table. Many write attempts will fail with the error 'SQLITE_BUSY' inside _run of SQLiteService.

Problem with bettersqlite3 timeout

However, the error is thrown only after 5s as this is the default timeout of bettersqlite3. During that time the DB is blocked for everyone. This is of course not desired. For this reason increasing the timeout does not solve the problem. Instead, decreasing the timeout (~10ms) with the following changes seemed to help.

BEGIN TRANSACTION

The error is thrown inside _run because transactions are initiated with BEGIN which only starts the transaction when the database is first accessed. In a concurrent context BEGIN IMMEDIATE is more suitable as this immediately throws SQLITE_BUSY which we then can use for retry logic.

Retry logic

Now we can handle SQLITE_BUSY error when BEGIN IMMEDIATE fails and retry for some time.

Changes to SQLiteService

After playing a bit around, the following changes yielded the best results. This may not be optimal solution but a starting point. Ignore the cds.logs, just for debugging purpose.

// add custom timeout
const dbc = new sqlite(database, { timeout: 10 })
  async exec(sql) {
    // use BEGIN IMMEDIATE
    if (sql === "BEGIN") {
      sql += " IMMEDIATE"
    }

    let counter = 0;
    let totalWaitTime = 0;
    const MAX_TOTAL_WAIT_TIME = 20000;

    while (totalWaitTime < MAX_TOTAL_WAIT_TIME) {
      try {
        const result = await this.dbc.exec(sql)
        if (counter > 0) {
          cds.log("SQLITE").error(`Retry #${counter} finally successfull after ${totalWaitTime}ms`);
        }
        return result;
      } catch (error) {
        if (error.code !== "SQLITE_BUSY" && error.code !== "SQLITE_BUSY_SNAPSHOT") {
          throw error;
        }

        counter++;
        const waitTime = Math.floor(Math.random() * (100 - 20 + 1)) + 20;
        await new Promise((res) => setTimeout(res, waitTime));
        totalWaitTime += waitTime;
        cds.log("SQLITE").warn(`Retry #${counter}, waited ${waitTime}ms total ${totalWaitTime}ms`);
      }
    }

    throw new Error("Database locked after multiple retries");
  }

The following options should probably be customizable by users:

  • bettersqlite3 timeout
  • Whether to use BEGIN IMMEDIATE
  • MAX_TOTAL_WAIT_TIME
  • minWaitTime, maxWaitTIme

@aschmidt93
Copy link
Contributor Author

aschmidt93 commented Nov 28, 2024

Retry-Logic can also be added without modifying @cap-js/sqlite. For this the SQLiteService class needs to be extended and the exec method overriden:

const SQLiteService = require('@cap-js/sqlite');

class SQLiteServiceExt extends SQLiteService {
  async exec(sql) {
    // custom retry logic
  }
}

module.exports = SQLiteServiceExt

And in cds.requires.db:

    "db": {
      "[development]": {
        "kind": "sqlite",
        "impl": "./srv/db-sqlite.js",
        "credentials": {
          "url": "../db/db.sqlite"
        }
      }
    },

However, there's currently no way to pass options (timeout) to the constructor of bettersqlite3. Therefore, it is required that options can be specified in the cds.requires.db section, e.g. in cds.requires.db.sqlite, which are then passed to the constructor of bettersqlite3.

Example:

    "db": {
      "[development]": {
        "kind": "sqlite",
        "impl": "./srv/db-sqlite.js",
        "credentials": {
          "url": "../db/db.sqlite"
        },
        "sqlite": {
          "timeout": 10
        }
      }
    },

Change required in SQLiteService.js, see issue #920

const dbc = new sqlite(database, this.options.sqlite)

edit: Setting the timeout of bettersqlite3 can also be achieved without passing options in the constructor, by overriding the factory-getter

  get factory() {
    const superResult = super.factory;
    return {
      ...superResult,
      create: tenant => {
        const dbc = superResult.create(tenant);
        dbc.pragma('busy_timeout = 10');
        return dbc;
      }
    }
  }

@aschmidt93
Copy link
Contributor Author

As this can be achieved with extending the SQLiteService class, adding this to @cap-js/sqlite does no longer seem necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sqlite
Projects
None yet
Development

No branches or pull requests

2 participants