Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(postgres): Remove SqlAlchemy dependency from postgres container #445

Conversation

jankatins
Copy link
Contributor

@jankatins jankatins commented Mar 6, 2024

Updates the pg testcontainer implementation to not use (and not install) SQLAlchemy nor psycopg2.

Closes: #340
Closes: #336
Closes: #320

@jankatins
Copy link
Contributor Author

jankatins commented Mar 6, 2024

Based on #340 by @logicbomb, I only rebased it and removed the dependency from pyproject-toml. Original PR description by @logicbomb:

I decided to include both the log check & the pg_isready check, as it seems they both need to be successful in order to enure the database is running.
I don't think the driver parameter belongs in Postgres.__init__ method, as it's only used when building the connection string, but I didn't want to introduce any breaking changes.

One outstanding review comment by @totallyzen:

Make it not be in generic.py. These "db containers" assume a certain URL format, but is it worth it to have an entire
class just for this? (I won't hide my distaste over DbContainer and why it is in core in the first place)

So what's the alternative (in the order of my preference, best to lowest)?

  • Move it to a different file?
  • Make all the db container implement ´_create_connection_url()`?
  • Add a different layer, like singledispatch to turn the container into an URL?

I didn't apply any changes in any of these directions yet, the current PR is just a rebase and small fix for the dependencies and newer PG versions in tests and examples.

@jankatins jankatins force-pushed the feature/remove-sqla-from-postgres-container branch from 0132a27 to 56f8081 Compare March 6, 2024 20:57
@jankatins
Copy link
Contributor Author

poetry run pytest -s --log-level=DEBUG modules/postgres passes locally :-)

@totallyzen
Copy link
Collaborator

So what's the alternative (in the order of my preference, best to lowest)?

Hm... I'm open to suggestions! 🤔

Let me think out aloud...

How I view the value of these containers... focus on

  • easy configuration
  • always start ready
  • give a connectivity URLs instead of clients (less enforced dependencies)

I think I'd like both of the first 2 options:

  • move to new module testcontainers.databases sounds contextually right
  • make the current db containers implement an abstract class that enforces some confidence it is hard to get wrong (like forgetting to override a method)

I am however happy if you only implement just one of those. I think you see where I'm coming from!

WDYT?

@jankatins
Copy link
Contributor Author

move to new module testcontainers.databases sounds contextually right

Does that module life under core/testcontainers/database or do you mean to add a complete subdirectory in the repo root? I would curerntly assume that it's core/testcontainers/database. That would basically be just a rename: currently it's called generic.py, now it would be a databases.py. I would currently prefer that, moving it to a extra module which is not DB specific looks trange given the current split into "modules are service specific code, core is everything which can be used by more than one module" (thats at leatst my first impression).

(I haven't understood why this split is needed, poetry build seems to "undo" it with all the includes, the only reason I can think of is that it made splitting the packages was easier but that's seems to be not needed anymore? for having the test "nearer" to the specific code?)

Another question_ do you consider these classes API or implementation detail? E.g. if I rename DbContainer to SQLAlchemyBasedDBContainer, is that a breakign change? (if it would be a breaking change, just moving them to another module would also be a breaking change, so I hope that these are implmentation details...)

@jankatins
Copy link
Contributor Author

jankatins commented Mar 6, 2024

Ok, one additional idea for the refactoring would be to just extract the sqla based _connect() (with the @wait_container_is_ready(*ADDITIONAL_TRANSIENT_ERRORS) around it) into its own file and a pure function and let every DB container which wants too implement their "wait for" check via SQLA just import it. Then we would have a simple single "DBContainer" without a dependency on SQLA, that one would have to implement the abstract _connect() (that seems to be the name used by most of the other services; _wait_for_ready()would probably be more appropriate...) and one would be done.

So the DBContainer would be:

from abc import ABC, abstractmethod
class DBContainer(ABC):
    @abstractmethod
    def _connect(self):
        """Check if the container is ready."""

    @abstractmethod
    def _configure(self):
        """Does basic setup so the container can start up (User, passwort, etc)."""

    @abstractmethod
    def get_connection_url(self, *, **kwargs): # psycopg v3 cannot handle sqla style URLs, so it needs a driver=None :-(
        """Get a DB connection URL which can be used to connect to the DB."""

    def _create_connection_url( ... as now ...):
        """Helper to create a DB API connection URL."""

(although he first two could actually be part of the DockerContainer interface... Most containers have that method already and the rest does that stuff in __init__).

Overall this is a BIG refactoring and it will touch probably half of the modules...

@totallyzen
Copy link
Collaborator

I haven't understood why this split is needed

I'm willing to hear "chill dude, it's fine". Please do call me out on my bias/pedanticism! I'm not averse to negative feedback. 🙂

The API level problem is what you are saying with _wait_for_ready() - this should be on all containers in fact, without fail.
I'm realising that maybe the "fix" to the API level problems with guarantees should live in it's own issue and merits it's own discussion, separate from this PR.

pure function and let every DB container which wants too implement their "wait for" check via SQLA just import it. Then we would have a simple single "DBContainer" without a dependency on SQLA, that one would have to implement the abstract _connect() (that seems to be the name used by most of the other services; _wait_for_ready()would probably be more appropriate...) and one would be done.

In terms of dev experience, I think you are onto something! It would cause the least amount of hassle and focus on meaningful fixes rather than my personal pedantic issues with generic.py. It can also lean on eventually fixing the underlying missing guarantee of _wait_for_ready() being implemented for all containers (even if it's just pass)

Final note on breaking changes -> we have semver for this exact reason!
I strongly believe it's okay to have breaking changes as long as you communicate how to upgrade and tell the story to your users.
If we give them a clear "upgrade path" that has very few steps, we should be fine!
The frequency of breaking changes will also lower over time as we iron out these issues.

I tried to condense my thoughts, I hope this helps!

@totallyzen
Copy link
Collaborator

To simplify your life for right now:

Overall this is a BIG refactoring and it will touch probably half of the modules...

Aye, we can take it in two steps:

  • introduce the dirty fix in THIS PR -> I'll approve, just to clear out the issue of "too many dependencies"
  • fix the _wait_for_ready() and _configure() abstract methods in a follow-up PR -> put it in 5.x

As for 5.x, we already have another important change that might break the API in the face of #314 which would put a time limit on container life (though I'm thinking how to implement this without breaking the API, if it's possible)

This was referenced Mar 7, 2024
@jankatins
Copy link
Contributor Author

jankatins commented Mar 7, 2024

I haven't understood why this split is needed
I'm willing to hear "chill dude, it's fine". Please do call me out on my bias/pedanticism! I'm not averse to negative feedback. 🙂

Just putting this here in case of misunderstandings: that was a honest question, I've never seen such a layout and it took me a bit to understand how that fits together in the final package.

From my understanding (and after the merge of all code into one package) it seems that all code could live in a single src/testcontainers/ folder if one accepts that there is a tests/ folder with the same (duplicated) hierarchy (so tests for src/testcontainers/postgres live in tests/postgres). Finding tests is a bit more scrolling (if your IDE does not let you jum to the tests) but packaging is probably a lot easier.

@jankatins
Copy link
Contributor Author

jankatins commented Mar 7, 2024

The dirty (and shortest) fix is to

  • Remove the DepedencyFreeDBContainer class
  • Overwrite _connect() in PG instead of _verify_status() WITHOUT calling super()

Because the current _connect() is importing SQLA only in the function, not calling super()._connect() would be enough to not trigger the import.

This would also be a completely backwards compatible change (apart from using testcontainers[postgres] and not getting sqla installed, which is the actual feature :-))

@jankatins jankatins force-pushed the feature/remove-sqla-from-postgres-container branch 3 times, most recently from 9180ea5 to e3272c1 Compare March 7, 2024 12:53
@jankatins
Copy link
Contributor Author

Implmented the "quick and fast" version above.

Tested if locally by poetry install --sync --extras=postgres (which deinstalled sqla and psycopg) and then running the tests poetry run pytest -s --log-level=DEBUG modules/postgres.

@jankatins jankatins force-pushed the feature/remove-sqla-from-postgres-container branch from e3272c1 to bf45a74 Compare March 7, 2024 13:14
@alexanderankin
Copy link
Member

alexanderankin commented Mar 7, 2024

this looks good to me. i understand there is a larger unfinished discussion about refactoring improvements - my thought is that if DbContainer was part of the public api (i would argue it is) then we cannot change it without version increases (maybe even major) so lets deprecate them now, and change the implementation underneath (overriding methods) (that is something we are allowed to do on patch increases). (i just mean in the sense that people should not use DbContainer going forward if we plan to remove it.)

@jankatins jankatins force-pushed the feature/remove-sqla-from-postgres-container branch 2 times, most recently from cfc3b27 to a45a991 Compare March 8, 2024 09:58
@jankatins
Copy link
Contributor Author

Added back the tests and also made a check against using sqla during container startup

@jankatins
Copy link
Contributor Author

jankatins commented Mar 8, 2024

(i just mean in the sense that people should not use DbContainer going forward if we plan to remove it.)

After this discussion (with adding _wait_for_ready() and _configure() to the top level class), DBContainer would only contain this:

from abc import ABC, abstractmethod
class DBContainer(ABC):

    @abstractmethod
    def get_connection_url(self, *, **kwargs): # psycopg v3 cannot handle sqla style URLs, so it needs a driver=None :-(
        """Get a DB connection URL which can be used to connect to the DB."""

    def _create_connection_url( ... as now ...):
        """Helper to create a DB API connection URL."""

I'm not sure if this is worth a extra class in the hierarchy, especially because it's unclear to me if that URL can be useful for all DB container or only the ones which are based on the python DB API (=RDBMS ones, not stuff like mongodb or redis): I've no clue if they also can be connected to with a single URL (kafka can not, I think).

Given that each container would need to implement the method anyway, one could also make _create_connection_url() a pure function and let each container just declare the method themselves...

Anyway: I would appreciate if this conversation could be moved to a separate issue :-)

@jankatins
Copy link
Contributor Author

Added the fix to make ruff happy, waiting for workflow approval @alexanderankin @totallyzen

@alexanderankin
Copy link
Member

ill keep an eye on this today but also we have a slack https://slack.testcontainers.org/ where i respond to pings and stuff

Remove test that was testing sqlalchemy support for driver types
Add tests for supported versions of Postgres
Modify the `get_connection_url` convenience method to support a
driverless url

Co-authored-by: Jan Katins <[email protected]>
Co-authored-by: Jason Turim <[email protected]>
@jankatins jankatins force-pushed the feature/remove-sqla-from-postgres-container branch from c5e2ccc to 69d60f3 Compare March 10, 2024 21:45
@jankatins
Copy link
Contributor Author

And another try.. rand black and ruff locally, so hopefully no more rounds... @alexanderankin

@alexanderankin alexanderankin merged commit f30eb1d into testcontainers:main Mar 10, 2024
12 checks passed
@alexanderankin
Copy link
Member

ok lets iterate on this:)

@alexanderankin alexanderankin changed the title feat(postgres): Remove SqlAlchemy dependency from postgres container fix(postgres): Remove SqlAlchemy dependency from postgres container Mar 10, 2024
@alexanderankin alexanderankin changed the title fix(postgres): Remove SqlAlchemy dependency from postgres container feat(postgres): Remove SqlAlchemy dependency from postgres container Mar 10, 2024
@jankatins jankatins deleted the feature/remove-sqla-from-postgres-container branch March 11, 2024 10:13
@alexanderankin alexanderankin changed the title feat(postgres): Remove SqlAlchemy dependency from postgres container fix(postgres): Remove SqlAlchemy dependency from postgres container Mar 11, 2024
alexanderankin pushed a commit that referenced this pull request Mar 11, 2024
…445)

Updates the pg testcontainer implementation to not use (and not install)
SQLAlchemy nor psycopg2.

Closes: #340
Closes: #336
Closes: #320

---------

Co-authored-by: Jason Turim <[email protected]>
alexanderankin added a commit that referenced this pull request Mar 11, 2024
🤖 I have created a release *beep* *boop*
---


##
[4.0.1](testcontainers-v4.0.0...testcontainers-v4.0.1)
(2024-03-11)


### Features

* **postgres:** Remove SqlAlchemy dependency from postgres container
([#445](#445))
([f30eb1d](f30eb1d))


### Bug Fixes

* **clickhouse:** clickhouse waiting
([#428](#428))
([902a5a3](902a5a3))
* Close docker client when stopping the docker container
([#380](#380))
([efb1683](efb1683))
* failing tests for elasticsearch on machines with ARM CPU
([#454](#454))
([701b23a](701b23a))
* **mongodb:** waiting for container to start (it was not waiting at all
before?)
([#461](#461))
([2c4f171](2c4f171))
* unclosed socket warning in db containers
([#378](#378))
([cd90aa7](cd90aa7))
* Update the copyright header for readthedocs
([#341](#341))
([5bef18a](5bef18a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: David Ankin <[email protected]>
alexanderankin added a commit that referenced this pull request Mar 11, 2024
🤖 I have created a release *beep* *boop*
---


##
[4.0.1](testcontainers-v4.0.0...testcontainers-v4.0.1)
(2024-03-11)


### Features

* **postgres:** Remove SqlAlchemy dependency from postgres container
([#445](#445))
([f30eb1d](f30eb1d))


### Bug Fixes

* **clickhouse:** clickhouse waiting
([#428](#428))
([902a5a3](902a5a3))
* Close docker client when stopping the docker container
([#380](#380))
([efb1683](efb1683))
* failing tests for elasticsearch on machines with ARM CPU
([#454](#454))
([701b23a](701b23a))
* go back to 4.0.1
([#465](#465))
([1ac8c24](1ac8c24))
* **mongodb:** waiting for container to start (it was not waiting at all
before?)
([#461](#461))
([2c4f171](2c4f171))
* unclosed socket warning in db containers
([#378](#378))
([cd90aa7](cd90aa7))
* Update the copyright header for readthedocs
([#341](#341))
([5bef18a](5bef18a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: David Ankin <[email protected]>
bearrito pushed a commit to bearrito/testcontainers-python that referenced this pull request Mar 30, 2024
…estcontainers#445)

Updates the pg testcontainer implementation to not use (and not install)
SQLAlchemy nor psycopg2.

Closes: testcontainers#340
Closes: testcontainers#336
Closes: testcontainers#320

---------

Co-authored-by: Jason Turim <[email protected]>
bearrito pushed a commit to bearrito/testcontainers-python that referenced this pull request Mar 30, 2024
🤖 I have created a release *beep* *boop*
---


##
[4.0.1](testcontainers/testcontainers-python@testcontainers-v4.0.0...testcontainers-v4.0.1)
(2024-03-11)


### Features

* **postgres:** Remove SqlAlchemy dependency from postgres container
([testcontainers#445](testcontainers#445))
([f30eb1d](testcontainers@f30eb1d))


### Bug Fixes

* **clickhouse:** clickhouse waiting
([testcontainers#428](testcontainers#428))
([902a5a3](testcontainers@902a5a3))
* Close docker client when stopping the docker container
([testcontainers#380](testcontainers#380))
([efb1683](testcontainers@efb1683))
* failing tests for elasticsearch on machines with ARM CPU
([testcontainers#454](testcontainers#454))
([701b23a](testcontainers@701b23a))
* **mongodb:** waiting for container to start (it was not waiting at all
before?)
([testcontainers#461](testcontainers#461))
([2c4f171](testcontainers@2c4f171))
* unclosed socket warning in db containers
([testcontainers#378](testcontainers#378))
([cd90aa7](testcontainers@cd90aa7))
* Update the copyright header for readthedocs
([testcontainers#341](testcontainers#341))
([5bef18a](testcontainers@5bef18a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: David Ankin <[email protected]>
bearrito pushed a commit to bearrito/testcontainers-python that referenced this pull request Mar 30, 2024
🤖 I have created a release *beep* *boop*
---


##
[4.0.1](testcontainers/testcontainers-python@testcontainers-v4.0.0...testcontainers-v4.0.1)
(2024-03-11)


### Features

* **postgres:** Remove SqlAlchemy dependency from postgres container
([testcontainers#445](testcontainers#445))
([f30eb1d](testcontainers@f30eb1d))


### Bug Fixes

* **clickhouse:** clickhouse waiting
([testcontainers#428](testcontainers#428))
([902a5a3](testcontainers@902a5a3))
* Close docker client when stopping the docker container
([testcontainers#380](testcontainers#380))
([efb1683](testcontainers@efb1683))
* failing tests for elasticsearch on machines with ARM CPU
([testcontainers#454](testcontainers#454))
([701b23a](testcontainers@701b23a))
* go back to 4.0.1
([testcontainers#465](testcontainers#465))
([1ac8c24](testcontainers@1ac8c24))
* **mongodb:** waiting for container to start (it was not waiting at all
before?)
([testcontainers#461](testcontainers#461))
([2c4f171](testcontainers@2c4f171))
* unclosed socket warning in db containers
([testcontainers#378](testcontainers#378))
([cd90aa7](testcontainers@cd90aa7))
* Update the copyright header for readthedocs
([testcontainers#341](testcontainers#341))
([5bef18a](testcontainers@5bef18a))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: David Ankin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Design question / is a hard dependency on SqlAlchemy really necessary?
3 participants