
Release v1.0.0

@nwronski released this on 21 Jan 19:10 · 3 commits to master since this release

Added

  • The root node of the AST now has type and variant properties:

    {
      "type": "statement",
      "variant": "list",
      "statement": [{
        "type": "statement",
        "variant": "select",
        "statement": {}
      }]
    }
  • There is now a command-line version of the parser when it is installed as a global module (e.g., npm i -g sqlite-parser). The sqlite-parser command can then be used to parse input SQL files and write the results to stdout or a JSON file. Additional usage instructions and options are available through sqlite-parser --help.

    sqlite-parser input.sql --output foo.json
    
  • To allow users to parse arbitrarily long SQL files or other readable stream sources, there is now a stream transform that can accept a readable stream and then push (write) out JSON strings of the ASTs for individual statements.

    • The AST for each statement is pushed down the stream as soon as it is read and parsed instead of reading the entire file into memory before parsing begins.

      var parserTransform = require('sqlite-parser').createParser();
      var readStream = require('fs').createReadStream('./large-input-file.sql');
      
      readStream.pipe(parserTransform);
      parserTransform.pipe(process.stdout);
      
      parserTransform.on('error', function (err) {
        console.error(err);
        process.exit(1);
      });
      
      parserTransform.on('finish', function () {
        process.exit(0);
      });
    • To pipe the output into a file that contains a single valid JSON structure, the output of the parser stream transform needs to be wrapped in a statement list node where every statement is separated by a comma.

      var fs = require('fs');
      var sqliteParser = require('sqlite-parser');
      var parserTransform = sqliteParser.createParser();
      var singleNodeTransform = sqliteParser.createStitcher();
      var readStream = fs.createReadStream('./large-input-file.sql');
      var writeStream = fs.createWriteStream('./large-output-file.json');
      
      readStream.pipe(parserTransform);
      parserTransform.pipe(singleNodeTransform);
      singleNodeTransform.pipe(writeStream);
      
      parserTransform.on('error', function (err) {
        console.error(err);
        process.exit(1);
      });
      
      writeStream.on('finish', function () {
        process.exit(0);
      });
  • Added the missing ATTACH DATABASE statement. It will pair nicely with the existing DETACH DATABASE statement.

    ATTACH DATABASE 'bees2.db' AS more_bees
  • SQLite allows you to enter basically anything you want for a datatype, such as the datatype for a column in a CREATE TABLE statement, because it does not enforce the types you provide. As a result, the parser will accept almost any unquoted string in place of a datatype. ref1 ref2

    CREATE TABLE t1(x DINOSAUR, y BIRD_PERSON);
  • Run the parser against the entire SQLite test corpus using the grunt testall command.

    • Warning: This command will parse ~49,000 queries across almost 900 different files, representing the entire SQLite test corpus at the time it was processed.
  • Allow multi-byte UTF-8 characters (e.g., \u1234) in identifier names.
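
    • For example, identifiers containing multi-byte characters can now be parsed directly (the table and column names below are only illustrative):

      CREATE TABLE übung(
        größe INTEGER,
        名前 TEXT
      );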

  • Add support for table functions in the FROM clause of a SELECT statement.

    SELECT
      j2.rowid, jx.rowid
    FROM
      j2, json_tree(j2.json) AS jx

Changed

  • BREAKING CHANGE Instead of publishing this module on npm as a browserified and minified bundle, the transpiled ES2015 code in lib/ is now published, and I have left it up to the end user to decide whether they want to browserify or minify the library. The combined, unminified file size for the published version of the parser is now ~127kB.

    • There is still a dist/ folder containing the minified browserified bundle that comes in at ~81kB (7% reduction from v0.14.5). This is defined in the package.json as the browser version of the module, which is recognized by tools such as jspm and browserify.
  • BREAKING CHANGE The on property of a CREATE INDEX statement is now treated as a table expression identifier, and has the corresponding type and variant:

    {
      "type": "statement",
      "variant": "create",
      "format": "index",
      "target": {
        "type": "identifier",
        "variant": "index",
        "name": "bees.hive_state"
      },
      "on": {
        "type": "identifier",
        "variant": "expression",
        "format": "table",
        "name": {
          "type": "identifier",
          "variant": "table",
          "name": "hive"
        },
        "columns": []
      }
    }
  • BREAKING CHANGE Indexed columns (e.g., the column list in the ON part of a CREATE INDEX statement) and ordering expressions (e.g., the ORDER BY part of a SELECT statement) now have the following format:

    • When they are followed by an ordering term (e.g., ASC, DESC) and/or COLLATE, such as ORDER BY nick ASC:
    {
      "order": [{
        "type": "expression",
        "variant": "order",
        "expression": {
          "type": "identifier",
          "variant": "column",
          "name": "nick"
        },
        "direction": "asc"
      }]
    }
    • But when it is only an expression or column name without any ordering term or COLLATE, it will just be the expression itself, and the implicit "direction": "asc" will not be added to the AST, such as ORDER BY nick:
    {
      "order": [{
        "type": "identifier",
        "variant": "column",
        "name": "nick"
      }]
    }
  • BREAKING CHANGE Because of changes to how binary expressions are parsed, the order in which expressions are composed may be different than in the previous release. For example, the ASTs for queries that contain multiple binary expressions may change:

    SELECT *
    FROM hats
    WHERE x != 2 OR x == 3 AND y > 5
  • BREAKING CHANGE Expressions such as x NOT NULL were previously treated as unary expressions but are now considered binary expressions.

    {
      "type": "expression",
      "format": "binary",
      "variant": "operation",
      "operation": "not",
      "left": {
        "type": "identifier",
        "variant": "column",
        "name": "x"
      },
      "right": {
        "type": "literal",
        "variant": "null",
        "value": "null"
      }
    }
  • BREAKING CHANGE Instead of transaction statements being parsed as a single statement of type transaction to be considered valid, each statement that makes up the transaction (e.g., BEGIN, END) is now considered its own distinct statement that can exist independently of the others. Because a single transaction can be spread across multiple input strings given to the parser, it is no longer treated as a single, large transaction statement.

    BEGIN;
    DROP TABLE t1;
    END;
    {
      "type": "statement",
      "variant": "list",
      "statement": [
        {
          "type": "statement",
          "variant": "transaction",
          "action": "begin"
        },
        {
          "type": "statement",
          "target": {
            "type": "identifier",
            "variant": "table",
            "name": "t1"
          },
          "variant": "drop",
          "format": "table",
          "condition": []
        },
        {
          "type": "statement",
          "variant": "transaction",
          "action": "commit"
        }
      ]
    }
  • BREAKING CHANGE COLLATE can now appear multiple times in a row wherever it would previously be allowed to appear, and as a result, the collate property of the AST will contain an array.

    SELECT 'cats'
    ORDER BY 1 COLLATE nocase COLLATE nocase
  • BREAKING CHANGE CONSTRAINT names can now appear multiple times before or after a column or table constraint in a CREATE TABLE statement. Having a CONSTRAINT name after the constraint is an undocumented SQLite feature. However, while it will not give an error, any constraint name appearing after the constraint is ignored.

    CREATE TABLE t2c(
      -- Two leading and two trailing CONSTRAINT clauses
      -- Name used: x_two
      x INTEGER CONSTRAINT x_one CONSTRAINT x_two
        CHECK( typeof( coalesce(x,0) ) == 'integer' )
        CONSTRAINT x_two CONSTRAINT x_three,
      y INTEGER,
      z INTEGER,
      -- Two trailing CONSTRAINT clauses
      -- Name used: (none)
      UNIQUE(x, y, z) CONSTRAINT u_one CONSTRAINT u_two
    )
  • BREAKING CHANGE JOIN clauses and table lists can now occur in the same FROM clause of a single SELECT statement. Tables separated by a comma will be included in the JOIN map as a cross join.

    SELECT *
    FROM aa LEFT JOIN bb, cc
    WHERE cc.c = aa.a;
  • BREAKING CHANGE A comma-separated list of table or subquery names in the FROM clause of a SELECT statement is now treated as a join map of cross joins. Also, each pair of comma-separated tables or subqueries can include a join constraint expression (e.g., ON t.col1 = b.col2).

    SELECT t1.rowid, t2.rowid
    FROM t1, t2 ON t1.a = t2.b;
    {
      "type": "statement",
      "variant": "list",
      "statement": [
        {
          "type": "statement",
          "variant": "select",
          "result": [
            {
              "type": "identifier",
              "variant": "column",
              "name": "t1.rowid"
            },
            {
              "type": "identifier",
              "variant": "column",
              "name": "t2.rowid"
            }
          ],
          "from": {
            "type": "map",
            "variant": "join",
            "source": {
              "type": "identifier",
              "variant": "table",
              "name": "t1"
            },
            "map": [
              {
                "type": "join",
                "variant": "cross join",
                "source": {
                  "type": "identifier",
                  "variant": "table",
                  "name": "t2"
                },
                "constraint": {
                  "type": "constraint",
                  "variant": "join",
                  "format": "on",
                  "on": {
                    "type": "expression",
                    "format": "binary",
                    "variant": "operation",
                    "operation": "=",
                    "left": {
                      "type": "identifier",
                      "variant": "column",
                      "name": "t1.a"
                    },
                    "right": {
                      "type": "identifier",
                      "variant": "column",
                      "name": "t2.b"
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  • BREAKING CHANGE The args property of an AST node is no longer an array; it now contains an expression list node with the arguments on its expression property.

    {
      "type": "expression",
      "variant": "list",
      "expression": []
    }
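
    • As an illustration, parsing a query such as the following should yield function-call nodes whose args property is an expression list node like the one above rather than a plain array (the table and column names are only illustrative):

      SELECT count(x), max(y, 10)
      FROM bees;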
  • BREAKING CHANGE All named values for properties such as variant, format, and type should always be lowercase, even when uppercase in the input SQL (e.g., variant is now natural join instead of NATURAL JOIN in the AST).
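
    • As an illustration, a statement like the following should now produce a join node whose variant is natural join rather than NATURAL JOIN (the table names are only illustrative):

      SELECT *
      FROM aa NATURAL JOIN bb;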

  • BREAKING CHANGE New format for CASE expression AST nodes:

    • variant when has a condition and a consequent
    • variant else has just a consequent
    • the outer expression is now variant case instead of binary
    • instead of taking whatever is provided between CASE and WHEN (e.g., CASE foo WHEN ...) and calling that the expression, it is now added as the discriminant
    select
        case acc
          when a = 0 then a1
          when a = 1 then b1
          else c1
        end
    {
      "type": "expression",
      "variant": "case",
      "expression": [
        {
          "type": "condition",
          "variant": "when",
          "condition": {
            "type": "expression",
            "format": "binary",
            "variant": "operation",
            "operation": "=",
            "left": {
              "type": "identifier",
              "variant": "column",
              "name": "a"
            },
            "right": {
              "type": "literal",
              "variant": "decimal",
              "value": "0"
            }
          },
          "consequent": {
            "type": "identifier",
            "variant": "column",
            "name": "a1"
          }
        },
        {
          "type": "condition",
          "variant": "when",
          "condition": {
            "type": "expression",
            "format": "binary",
            "variant": "operation",
            "operation": "=",
            "left": {
              "type": "identifier",
              "variant": "column",
              "name": "a"
            },
            "right": {
              "type": "literal",
              "variant": "decimal",
              "value": "1"
            }
          },
          "consequent": {
            "type": "identifier",
            "variant": "column",
            "name": "b1"
          }
        },
        {
          "type": "condition",
          "variant": "else",
          "consequent": {
            "type": "identifier",
            "variant": "column",
            "name": "c1"
          }
        }
      ],
      "discriminant": {
        "type": "identifier",
        "variant": "column",
        "name": "acc"
      }
    }
  • BREAKING CHANGE New format for EXISTS expression nodes. An expression node is now used in the condition for the IF NOT EXISTS, NOT EXISTS, and EXISTS modifiers instead of a string value.

    • CREATE TABLE statement

      create table if not exists foo(id int)
      {
        "type": "statement",
        "name": {
          "type": "identifier",
          "variant": "table",
          "name": "foo"
        },
        "variant": "create",
        "format": "table",
        "definition": [
          {
            "type": "definition",
            "variant": "column",
            "name": "id",
            "definition": [],
            "datatype": {
              "type": "datatype",
              "variant": "int",
              "affinity": "integer"
            }
          }
        ],
        "condition": [
          {
            "type": "condition",
            "variant": "if",
            "condition": {
              "type": "expression",
              "variant": "exists",
              "operator": "not exists"
            }
          }
        ]
      }
    • DROP TABLE statement

      drop table if exists foo
      {
        "type": "statement",
        "target": {
          "type": "identifier",
          "variant": "table",
          "name": "foo"
        },
        "variant": "drop",
        "format": "table",
        "condition": [
          {
            "type": "condition",
            "variant": "if",
            "condition": {
              "type": "expression",
              "variant": "exists",
              "operator": "exists"
            }
          }
        ]
      }
    • NOT EXISTS expression

      select a
      where not exists (select b)
      {
        "type": "statement",
        "variant": "select",
        "result": [
          {
            "type": "identifier",
            "variant": "column",
            "name": "a"
          }
        ],
        "where": [
          {
            "type": "expression",
            "format": "unary",
            "variant": "exists",
            "expression": {
              "type": "statement",
              "variant": "select",
              "result": [
                {
                  "type": "identifier",
                  "variant": "column",
                  "name": "b"
                }
              ]
            },
            "operator": "not exists"
          }
        ]
      }
  • A RangeError: Maximum call stack size exceeded was generated when running the uglified bundle (dist/sqlite-parser.js) in the browser, so I am skipping the minification step and only publishing the browserified bundle.

Fixed

  • Fixed binary expression parsing logic so that it can handle expressions such as:

    SELECT * FROM t where -1 * (2 + 3);
    SELECT * FROM t where 3 + 4 * 5 > 20;
    SELECT * FROM t where v1 = ((v2 * 5) - v3);
  • Allow a qualified table name in the ON clause of a CREATE TRIGGER statement (e.g., ON dbName.tableName).
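
    • A minimal sketch of the now-accepted form (the schema, trigger, and table names are only illustrative):

      CREATE TRIGGER log_insert AFTER INSERT ON main.bees
      BEGIN
        SELECT 1;
      END;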

  • Allow literal boolean values on and off in PRAGMA statements:

    PRAGMA legacy_file_format = ON;
  • Do not treat TEMP or ROWID as reserved words when they are used as identifiers (e.g., a table named temp or the rowid column of a table), since the official parser allows temp and rowid in those positions.

    CREATE TABLE temp.t1(a, b);
    SELECT rowid AS "Internal Row ID" FROM bees;
  • Require semicolons to delineate BEGIN and END statements for transactions while also allowing unnecessary semicolons to be omitted:

    BEGIN;CREATE TABLE nick(a, b);END
  • Only allow CRUD operations inside of the body of a CREATE TRIGGER statement.
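
    • For example, a trigger body made up of CRUD statements like the following parses, whereas non-CRUD statements in the body (e.g., a nested CREATE TABLE) are rejected (the names are only illustrative):

      CREATE TRIGGER archive_bee AFTER DELETE ON bees
      BEGIN
        INSERT INTO retired_bees (hive) VALUES (old.hive);
        DELETE FROM hive_stats WHERE hive = old.hive;
      END;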

  • Allow empty strings or NULL to be used as aliases, to match the behavior of the native SQLite parser, such as in an ATTACH DATABASE statement:

    ATTACH DATABASE '' AS ''
  • Allow datatype names to be provided to COLLATE to match the behavior of the official SQLite parser:

    SELECT c1
    FROM t
    ORDER BY 1 COLLATE numeric
  • Some CREATE TRIGGER statements were previously parsed as binary expressions instead of create trigger statements.

  • Allow indexed columns to be parsed when they include a COLLATE and/or an ordering direction (e.g., ASC, DESC) as part of a table constraint in a CREATE TABLE statement or the ON part of a CREATE INDEX statement:

    CREATE TABLE t1(id int, PRIMARY KEY(x COLLATE binary ASC, y COLLATE hex, z DESC))
  • Allow the UNIQUE column constraint type to be parsed correctly.

    CREATE TABLE bees(
      a INTEGER UNIQUE ON CONFLICT IGNORE
    )
  • Allow nested unary expressions while also preserving the correct order of precedence.

    SELECT not not foo
    FROM bees
  • The action (e.g., ADD COLUMN) and target (e.g., the table name) of an ALTER TABLE statement were not being added to the AST.
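
    • For example, both the ADD COLUMN action and the target table of the following statement now appear in the AST (the names are only illustrative):

      ALTER TABLE bees ADD COLUMN wing_count INTEGER;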

  • Allow AUTOINCREMENT in the column list of a PRIMARY KEY table constraint.

    CREATE TABLE t7(
      x INTEGER,
      y REAL,
      PRIMARY KEY(x AUTOINCREMENT)
    );
  • Now supporting custom datatypes with affinity inference where possible. See this page for an explanation of how type affinity is chosen for custom types.

    CREATE TABLE t3(
      -- Affinity: NUMERIC
      d STRING,
      -- Affinity: INTEGER
      e FLOATING POINT(5,10),
      -- Affinity: TEXT
      f NATIONAL CHARACTER(15) COLLATE RTRIM,
      -- Affinity: INTEGER
      g LONG INTEGER DEFAULT( 3600*12 )
    );
  • Allow a trailing . in a decimal value (e.g., SELECT 1. + 1).

  • Allow functions to have datatype names such as date(arg) or time(now).
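
    • For example, using the built-in SQLite date and time functions:

      SELECT date('now'), time('now');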

  • Allow reserved words in the CTE columns of the USING clause of a VIRTUAL TABLE statement (e.g., from, to).
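
    • A sketch of the now-accepted form (the fts4 module and column names are only illustrative):

      CREATE VIRTUAL TABLE email USING fts4(from, to, subject, body);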

  • Better nested expression parsing when unnecessary parentheses are used within a complex expression.

    SELECT
      SUM(
        CASE
          WHEN ( t.color != 'yellow' ) THEN 1
          ELSE 0
        END
      ) AS imposter_bee_count
    FROM
      bees t;
  • Allow a reserved word to be used as a column name in a CREATE TABLE statement as long as it can be safely implied that it is meant to be a column name.

    CREATE TABLE changelog(
      desc TEXT
    );
  • Allow a WITH clause before a SELECT statement wherever a SELECT statement can be found in a complex query, such as in an insert into select query.

    INSERT INTO t6
    WITH s(x) AS ( VALUES (2) UNION ALL SELECT x + 2 FROM s WHERE x < 49 )
    SELECT * FROM s;
  • A view expression can now be used in a CREATE VIEW statement.

    CREATE VIEW v1(a, b) AS VALUES(1, 2), (3, 4);