Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert extensions to bundles #281

Open
erichanson opened this issue Nov 16, 2023 · 2 comments
Open

Convert extensions to bundles #281

erichanson opened this issue Nov 16, 2023 · 2 comments
Labels
epic big meta-ticket that is more of an outcome than a specific task refactor

Comments

@erichanson
Copy link
Member

erichanson commented Nov 16, 2023

Aquameta's extensions, except for meta and bundle (the bootstrap), should be converted to bundles. The extension-to-bundle.sql script is the first of such experiments, but it can be a lot better. Using techniques learned from the meta id generator, we can do some incredible stuff.

Converting extensions to bundles will mean that we can have robust bundle dependencies, and in the future, do all Aquameta development within the bundle VCS, including core. It means everything will be data except for the bootstrap. We'll be free of the extension system's many shortcomings. It means future updates to Aquameta will just be another bundle pull.

General approach:

  • Call function extension_to_bundle(extension_name, bundle_name):
    • Create a new bundle $bundle_name
    • Look up the contents of the extension
      • For aquameta's extensions, they're simply divided by schema name, but this isn't true for every extension in the world. Do we care?
      • We could use pg_catalog.pg_depends which is where the contents of an extension is stored (by oid :/)
    • Find those rows in the meta catalog
    • Track, stage and commit those rows
  • Drop the extension
  • Checkout the bundle

Open questions:

  • PostgreSQL doesn't retain the original source code for, say, a CREATE TABLE statement. You can get some of the entity definitions back out, but it's not the original source code, but rather PostgreSQL's internal representation, then re-serialized back out. (plpgsql functions are an exception). When everything is data, where is the source code? We probably want to retain that somewhere.
@erichanson erichanson added epic big meta-ticket that is more of an outcome than a specific task refactor labels Nov 16, 2023
@erichanson
Copy link
Member Author

erichanson commented Nov 21, 2023

Getting close. Running the extension-to-bundle.sql script calls extension_to_bundle() on each extension except meta and bundle, and outputs below.

After the conversion to bundles, I deleted all the bundled schemas and recreated their extensions, which gives a really nice diff between what's in the bundle and what the extension creates.

Remaining issues:

  • function parameters are changing order
  • Functions lose OUT parameters
  • A few functions aren't being created, because they are missing a dependency function that is installed later -- lack of dependency tracking
  • Functions that take arrays as arguments are geting tyeir type signature fouled up. information_schema.parameters is really useless and awful in this space
  • Views are showing up as changed in offstage_field_changed, because their query definition changes depending on search_path. Set search_path to empty before checking the working copy for changes. Not sure we can even force this with from within a view
  • untracked_row contains every pg object in bundle and meta. As it should. But this has to be addressed. When a schema or relation is ignored, its non-meta representation doesn't show up in meta, but it's meta row still does. Meta either sees everything or nothing.
  • head_commit_row_with_exists and head_db_stage_changed are very slow. The expensive part here is the check to see if the row exists in the working copy. Gotta figure something out.
  • meta.constraint_unique and meta.constraint_check views are not being imported yet.
  • meta.type only supports insert/update/delete of enums and composite types.
Null display is "💩".
Timing is on.
Timing is on.
BEGIN
Time: 0.602 ms
CREATE EXTENSION
Time: 31.941 ms
CREATE EXTENSION
Time: 3.067 ms
CREATE EXTENSION
Time: 5.236 ms
CREATE EXTENSION
Time: 81.313 ms
CREATE EXTENSION
Time: 28.216 ms
CREATE EXTENSION
Time: 70.854 ms
SET
Time: 0.524 ms
INSERT 0 12
Time: 0.661 ms
CREATE FUNCTION
Time: 1.242 ms
NOTICE:  extension_to_bundle(): Converting extension event to bundle org.aquameta.ext.event
NOTICE:  bundle: Committing to org.aquameta.ext.event
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.event / e20b5add-6d79-464c-871d-a83645529309 @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
 extension_to_bundle
---------------------

(1 row)

Time: 36065.190 ms (00:36.065)
NOTICE:  extension_to_bundle(): Converting extension endpoint to bundle org.aquameta.ext.endpoint
NOTICE:  bundle: Committing to org.aquameta.ext.endpoint
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.endpoint / f570bd07-3253-404f-a87d-3b4914689fd7 @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,path_to_function_id,{text})"): function endpoint.urldecode_arr(text) does not exist
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,column_list,""{text,text,text,pg_catalog._text,pg_catalog._text}"")"): "column_list" is not a known variable
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,rows_select_function,""{meta.function_id,json}"")"): "mimetype" is not a known variable
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,field_select,{meta.field_id})"): "mimetype" is not a known variable
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,columns_json,""{text,text,pg_catalog._text,pg_catalog._text}"")"): "json" is not a known variable
NOTICE:  CHECKOUT EXCEPTION checking out (meta,function,id,"(endpoint,anonymous_rows_select_function,""{text,text,json}"")"): variable "mimetype" does not exist
 extension_to_bundle
---------------------

(1 row)

Time: 88972.754 ms (01:28.973)
NOTICE:  extension_to_bundle(): Converting extension widget to bundle org.aquameta.ext.widget
NOTICE:  bundle: Committing to org.aquameta.ext.widget
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.widget / 893fcc7a-0e5b-4613-a41b-d607975a8881 @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
 extension_to_bundle
---------------------

(1 row)

Time: 29897.816 ms (00:29.898)
NOTICE:  extension_to_bundle(): Converting extension semantics to bundle org.aquameta.ext.semantics
NOTICE:  bundle: Committing to org.aquameta.ext.semantics
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.semantics / d0d90269-e733-4cf5-ba6c-58cadddcc4e5 @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
 extension_to_bundle
---------------------

(1 row)

Time: 29187.992 ms (00:29.188)
NOTICE:  extension_to_bundle(): Converting extension ide to bundle org.aquameta.ext.ide
NOTICE:  bundle: Committing to org.aquameta.ext.ide
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.ide / e85ad162-0be2-4b61-a18d-3da7f91c84b0 @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
 extension_to_bundle
---------------------

(1 row)

Time: 5623.233 ms (00:05.623)
NOTICE:  extension_to_bundle(): Converting extension documentation to bundle org.aquameta.ext.documentation
NOTICE:  bundle: Committing to org.aquameta.ext.documentation
NOTICE:  bundle: Committing rowset_rows...
NOTICE:  bundle: Committing blobs...
NOTICE:  bundle: Committing stage_row_fields...
NOTICE:  bundle: Creating the commit...
NOTICE:  bundle: Updating bundle.head_commit_id...
NOTICE:  bundle: Cleaning up after commit...
NOTICE:  bundle.checkout(): org.aquameta.ext.documentation / 6054192c-64a4-48df-a67d-3aab9982603a @ 2023-11-21 05:26:07.745313 by <NULL>: "initial import"
 extension_to_bundle
---------------------

(1 row)

Time: 7631.344 ms (00:07.631)
COMMIT
Time: 19.276 ms

@erichanson
Copy link
Member Author

Timing on bundle views, with everything installed:

dev@aquameta:~/dev/aquameta/experimental$ cat view-timing.sql | psql -e
select count(*) from _bundle_blob;
 count
-------
  5620
(1 row)

Time: 243.264 ms
select count(*) from head_commit_field;
 count
-------
  6400
(1 row)

Time: 26.291 ms
select count(*) from head_commit_row;
 count
-------
  2060
(1 row)

Time: 2.947 ms
select count(*) from head_commit_row_with_exists;
 count
-------
  2060
(1 row)

Time: 518.459 ms
select count(*) from head_db_stage;
 count
-------
  2075
(1 row)

Time: 3995.566 ms (00:03.996)
select count(*) from head_db_stage_changed;
 count
-------
    15
(1 row)

Time: 3842.206 ms (00:03.842)
select count(*) from not_ignored_row_stmt;
 count
-------
    52
(1 row)

Time: 54.009 ms
select count(*) from offstage_field_changed;
 count
-------
     0
(1 row)

Time: 1366.073 ms (00:01.366)
select count(*) from offstage_row_deleted;
 count
-------
     0
(1 row)

Time: 506.547 ms
select count(*) from offstage_row_deleted_by_relation;
 count
-------
     0
(1 row)

Time: 485.427 ms
select count(*) from offstage_row_deleted_by_schema;
 count
-------
     0
(1 row)

Time: 481.131 ms
select count(*) from stage_row;
 count
-------
  2060
(1 row)

Time: 5.149 ms
select count(*) from stage_row_field;
 count
-------
  6400
(1 row)

Time: 753.301 ms
select count(*) from trackable_relation;
 count
-------
    52
(1 row)

Time: 20.232 ms
select count(*) from tracked_row;
 count
-------
  2075
(1 row)

Time: 14.473 ms
select count(*) from untracked_row;
 count
-------
     2
(1 row)

Time: 3233.124 ms (00:03.233)
select count(*) from untracked_row_by_relation;
 count
-------
     2
(1 row)

Time: 3266.709 ms (00:03.267)
select count(*) from untracked_row_by_schema;
 count
-------
     2
(1 row)

Time: 3253.589 ms (00:03.254)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
epic big meta-ticket that is more of an outcome than a specific task refactor
Projects
None yet
Development

No branches or pull requests

1 participant