Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DNM] Implement dialect "tidb+pymysql"/"tidb+mysqldb" for sqlalchemy #63

Closed
wants to merge 12 commits into from

Conversation

JaySon-Huang
Copy link
Contributor

@JaySon-Huang JaySon-Huang commented Oct 22, 2024

What's changed

  • Introduce class TiDBDialect_mysqldb/TiDBDialect_pymysql for connection string "tidb+mysqldb"/"tidb+pymysql"
    • These two dialect classes for tidb overwrite the ddl_compiler by TiDBDDLCompiler so we can generate the correct SQL for:
      • altering tiflash replica
      • add vector index
  • Introduce class VectorIndex and TiFlashReplica so that we can define them using the sqlalchemy framework
  • Introduce function get_declarative_base so that users can utilize it to define a table model with VectorIndex. See the examples/orm-sqlalchemy-quickstart/sqlalchemy-quickstart.py or tests/sqlalchemy/test_sqlalchemy.py for details.

And we can support other tidb-specified features like AUTO_RANDOM in the sqlalchemy-tidb framework

Local test result

>  TEST_TIDB_HOST=127.0.0.1 TEST_TIDB_PORT=8042 tox -e py39
py39: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jayson/Projects/tidb-vector-python/.tox/.tmp/package/12/tidb_vector-0.0.0.tar.gz
py39: commands[0]> pytest tests
=================================================================================================================================== test session starts ===================================================================================================================================
platform darwin -- Python 3.9.6, pytest-8.3.3, pluggy-1.5.0
cachedir: .tox/py39/.pytest_cache
rootdir: /Users/jayson/Projects/tidb-vector-python
configfile: pyproject.toml
collected 40 items                                                                                                                                                                                                                                                                        

tests/integrations/test_utils.py .                                                                                                                                                                                                                                                  [  2%]
tests/integrations/test_vector_client.py ..........                                                                                                                                                                                                                                 [ 27%]
tests/peewee/test_peewee.py .............                                                                                                                                                                                                                                           [ 60%]
tests/sqlalchemy/test_sqlalchemy.py ................                                                                                                                                                                                                                                [100%]

==================================================================================================================================== warnings summary =====================================================================================================================================
tests/sqlalchemy/test_sqlalchemy.py::TestSQLAlchemy::test_insert_get_record
  /Users/jayson/Projects/tidb-vector-python/tests/sqlalchemy/test_sqlalchemy.py:57: SAWarning: Dialect tidb:pymysql will not make use of SQL compilation caching as it does not set the 'supports_statement_cache' attribute to ``True``.  This can have significant performance implications including some performance degradations in comparison to prior SQLAlchemy versions.  Dialect maintainers should seek to set this attribute to True after appropriate development and testing for SQLAlchemy 1.4 caching support.   Alternatively, this attribute may be set to False which will disable this warning. (Background on this warning at: https://sqlalche.me/e/20/cprf)
    session.query(Item1Model).delete()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================================================================================= 40 passed, 1 warning in 55.69s ==============================================================================================================================
.pkg: _exit> python /Users/jayson/Projects/tidb-vector-python/lib/python3.13/site-packages/pyproject_api/_backend.py True poetry.core.masonry.api
  py39: OK (58.66=setup[2.77]+cmd[55.89] seconds)
  congratulations :) (58.72 seconds)

Inspired from

Signed-off-by: JaySon-Huang <[email protected]>
Signed-off-by: JaySon-Huang <[email protected]>
Signed-off-by: JaySon-Huang <[email protected]>
Signed-off-by: JaySon-Huang <[email protected]>
@JaySon-Huang JaySon-Huang changed the title [WIP] Implement dialect "tidb+mysqldb" [WIP] Implement dialect "tidb+mysqldb" for sqlalchemy Oct 23, 2024
Signed-off-by: JaySon-Huang <[email protected]>
@JaySon-Huang JaySon-Huang changed the title [WIP] Implement dialect "tidb+mysqldb" for sqlalchemy Implement dialect "tidb+pymysql"/"tidb+mysqldb" for sqlalchemy Oct 23, 2024
@JaySon-Huang
Copy link
Contributor Author

/cc @wd0517 @Mini256 @breezewish

Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive improvement!! I like the way how user use it now.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@wd0517
Copy link
Collaborator

wd0517 commented Oct 23, 2024

Great job!
However, I think this should be merged to https://github.com/pingcap/sqlalchemy-tidb. We also need to maintain the sqlalchemy-tidb repository again and ensure it works for both sqlalchemy 1.4 and 2.0. Any suggestions?

BTW, a simplified and MySQL-compatible syntax for creating vector indexe would be more preferable.

@JaySon-Huang
Copy link
Contributor Author

BTW, a simplified and MySQL-compatible syntax for creating vector indexes would be more preferable.

Yes, we have further plan for making it more easier for creating vector indexes with some optional parameters in the SQL.

However, I think this should be merged to https://github.com/pingcap/sqlalchemy-tidb. We also need to maintain the sqlalchemy-tidb repository again and ensure it works for both sqlalchemy 1.4 and 2.0. Any suggestions?

Get it. I didn't notice there is such a lib there. I'll try to get the changes merged into sqlalchemy-tidb first.

wd0517 added a commit that referenced this pull request Nov 5, 2024
As an alternative to
#63

After discussion with @JaySon-Huang, inspired by some other vector
databases, I think we could simply introduce some side functions for
creating indexes. The usage is as simple as using native SQLAlchemy
interfaces, without the need to introduce another dialect like
sqlalchemy-tidb.

Maintaining a new SQLAlchemy dialect is costy, as we need to support
both SQLAlchemy 1.4 and SQLAlchemy 2.0, and support other TiDB features
like AUTO_RANDOM.

---------

Signed-off-by: Wish <[email protected]>
Co-authored-by: WD <[email protected]>
@JaySon-Huang JaySon-Huang changed the title Implement dialect "tidb+pymysql"/"tidb+mysqldb" for sqlalchemy [DNM] Implement dialect "tidb+pymysql"/"tidb+mysqldb" for sqlalchemy Nov 5, 2024
@JaySon-Huang
Copy link
Contributor Author

Only suite for sqlalchemy~2.0. Closing it because it take too much maintenance cost

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants