sqlalchemy engine should be created once per process #27897

pedro-r-marques · 2024-04-04T11:38:56Z

Bug description

According to the SQL Alchemy documentation (https://docs.sqlalchemy.org/en/20/core/connections.html) users of the create_engine API should create the engine once per process. Superset is currently creating sqla Engine objects multiple times in superset/models/core.py:Database class since the database object is created on demand each the chart data is read.

The current approach makes it that the SqlAlchemy pool and connection management functionality doesn't work correctly. By allocating a single Engine object per database per process, superset can reuse all the Sql Alchemy functionality and follow the recommended usage of the API.

How to reproduce the bug

Use the DB_CONNECTION_MUTATOR hook in order to configure a SQLA connection pool (e.g. Queue with size 1)
Observe in the driver (e.g. duckdb SQL engine) that multiple connection to the database are created. This is a direct consequence of the fact that multiple engines are created. After patching the superset code to ensure that a single engine is created per URL (as per SQLA recommended usage) SQLA connection pools work as expected.

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.11

Node version

Not applicable

Browser

Not applicable

Additional context

No response

Checklist

I have searched Superset docs and Slack and didn't find a solution to my problem.
I have searched the GitHub issue tracker and didn't find a similar bug report.
I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

pedro-r-marques linked a pull request Apr 4, 2024 that will close this issue

fix(models.core:Database): Allocate a SQLA engine once per process per URL. #27899

Open

9 tasks

rusackas assigned betodealmeida, villebro and john-bodley Apr 4, 2024

michael-s-molina linked a pull request Apr 5, 2024 that will close this issue

fix(models.core:Database): Allocate a SQLA engine once per process per URL. #27899

Open

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sqlalchemy engine should be created once per process #27897

sqlalchemy engine should be created once per process #27897

pedro-r-marques commented Apr 4, 2024

sqlalchemy engine should be created once per process #27897

sqlalchemy engine should be created once per process #27897

Comments

pedro-r-marques commented Apr 4, 2024

Bug description

How to reproduce the bug

Screenshots/recordings

Superset version

Python version

Node version

Browser

Additional context

Checklist