Implement first end-to-end client/server flow #23

lmazz1-dbt · 2024-12-06T14:02:06Z

Context

Changes proposed in this pull request

Added httpx to the project
Created _handler.py which implements the interface for all client functions/methods to talk to the API
Created common objects for the resolution graph
Created a standard and more compact way of representing hashes as strings
Implemented the end-to-end flow for requesting the resolution graph in this new client-server world
Tested all of the above, including adding a dependency on VCRpy
Dockerised the API, with a dev mode (hot reload), and ran it for testing both locally and through the GitHub workflow
Rationalised the way environment variables are set: now they are only loaded using dotenv, and read from a committed test.env
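
For reviewers who want a feel for the compact hash representation mentioned in the list above, here is a minimal sketch of one way to do it. It assumes the hashes are raw SHA-256 bytes and that "compact" means base64 rather than hex; the helper names `hash_to_str` and `str_to_hash` are hypothetical and may not match what is in src/matchbox/common/hash.py.

```python
import base64
import hashlib


def hash_to_str(digest: bytes) -> str:
    """Encode raw hash bytes as URL-safe base64 text (hypothetical helper)."""
    return base64.urlsafe_b64encode(digest).decode("ascii")


def str_to_hash(text: str) -> bytes:
    """Invert hash_to_str, recovering the raw hash bytes."""
    return base64.urlsafe_b64decode(text.encode("ascii"))


# Illustrative input only: any 32-byte digest behaves the same way.
digest = hashlib.sha256(b"example resolution").digest()
encoded = hash_to_str(digest)

# Compare the length of the hex form with the length of the base64 form.
print(len(digest.hex()), len(encoded))
```

The point of the round trip is that the string form stays lossless while being shorter than hex, which helps when hashes appear in JSON payloads or logs.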

Guidance to review

Settled on HTTPX in the end rather than requests, as HTTPX comes with FastAPI[standard] anyway
I have done quite a lot of renaming of "models" to "resolutions" and similar, but I tried to do it only when writing new code or referencing new code. This keeps the changes in this PR in scope and, for the moment, keeps the "Model" entity name as an implementation detail of the adapter. Let's continue the refactor in a different ticket (this one potentially).
I've created a client folder. I think (though it is debatable) that it would be nice to have three top-level folders: client, server and common. Even as a temporary measure, though, the folder is useful for separating the client functions we have migrated to this new client/server world
- Side note: There are a number of things in common that shouldn't be. Exceptions are an example: I think we need different exception types for client and server because, as we disentangle the two, their errors will be quite different. Again, let's pick this up later.
Previously we added datasets as well as "models of type dataset" to the "model subgraph" returned as a rustworkx object. I thought that didn't actually fit within the resolution graph, so I changed it. I also fixed a bug in the previous logic constructing the "model subgraph" that would have inaccurately labelled e.g. human resolutions as "models". Finally, I changed the query to return only direct edges, not all closure table entries
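
To illustrate the difference between direct edges and all closure table entries, here is a small sketch. It assumes each closure row stores an ancestor, a descendant and the number of hops between them; the `ClosureRow` type, the `distance` field and the resolution names are hypothetical and stand in for whatever the ORM actually uses.

```python
from typing import NamedTuple


class ClosureRow(NamedTuple):
    """One closure-table entry: `parent` is an ancestor of `child`."""

    parent: str
    child: str
    distance: int  # assumed convention: 1 = direct parent, 2 = grandparent, ...


def direct_edges(rows: list[ClosureRow]) -> set[tuple[str, str]]:
    """Keep only the entries that are real edges of the resolution graph."""
    return {(row.parent, row.child) for row in rows if row.distance == 1}


rows = [
    ClosureRow("dataset_a", "dedupe_a", 1),
    ClosureRow("dedupe_a", "link_ab", 1),
    ClosureRow("dataset_a", "link_ab", 2),  # transitive entry, not an edge
]
edges = direct_edges(rows)
print(sorted(edges))
```

Without the filter, the transitive row would show up as an extra edge from `dataset_a` straight to `link_ab`, which is not how the graph was built.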
To test _handler.py, after having looked into a few options, I'm using VCR cassettes. Essentially, the API is actually hit only the first time a test is run, and the result is stored in a fixture. If the API endpoint changes, the fixture needs to be deleted manually. Once we write more complex tests of this kind, we'll need to work out what happens when the client request changes, but I believe it will also try to hit the API again in that case
- However, what happens if tests aren't run locally, and the first time a cassette is created is through the GitHub test pipeline? I conjecture that if the test was going to pass, it will still pass, just without adding the fixture to the repo, which I guess is fine? The fixture is going to be created the next time all tests are run (which PR authors are requested to do anyway, and check the checkbox in the PR template)
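
The record-then-replay idea behind the cassettes can be sketched with the standard library alone. This is not VCR.py and does not use its API: `replay_or_record`, `fake_api` and the `/report/resolutions` path are hypothetical names for illustration. In VCR.py itself, how a request that is missing from an existing cassette is handled depends on the configured record mode.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def replay_or_record(cassette: Path, url: str, fetch: Callable[[str], dict]) -> dict:
    """Return the stored response for `url`, calling `fetch` only on a miss."""
    recorded = json.loads(cassette.read_text()) if cassette.exists() else {}
    if url not in recorded:
        recorded[url] = fetch(url)  # the only place the "API" is hit
        cassette.write_text(json.dumps(recorded))
    return recorded[url]


calls: list[str] = []


def fake_api(url: str) -> dict:
    """Stand-in for the real API; it logs every call it receives."""
    calls.append(url)
    return {"status": "ok"}


with tempfile.TemporaryDirectory() as tmp:
    cassette = Path(tmp) / "resolution_graph.json"
    first = replay_or_record(cassette, "/report/resolutions", fake_api)
    second = replay_or_record(cassette, "/report/resolutions", fake_api)

print(first == second, len(calls))
```

The second call is served from the file, so deleting the file is what forces a fresh recording, which matches the manual-deletion step described above.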

Checklist:

My code follows the style guidelines of this project
New and existing unit tests pass locally with my changes

wpfl-dbt

I love matchbox.common.graph. I've been thinking that we should describe, somewhere, the graph that all backends present views onto, because I think it's a useful conceptual introduction and vital for anyone writing a backend. This is a great move towards codifying the abstract model.

I trust you on VCR as a good solution for testing requests. I'm not going to look any further into it unless you think I should.

Why aren't you finishing the job with the Models* table name purge?

I'm not wholly convinced by test.env. It's also my experience that having sample.env and test.env be different is useful: sample.env can showcase features, whereas test.env has a job to do (see my sample one in the indexing branch).

docker-compose.yml

.github/workflows/pytest.yml

test/client/test_handler.py

src/matchbox/client/_handler.py

src/matchbox/common/hash.py

src/matchbox/server/Dockerfile

src/matchbox/server/postgresql/adapter.py

src/matchbox/server/postgresql/orm.py

test.env

wpfl-dbt

Happy based on our discussion that the full resolution rename happens in the next PR.

lmazz1-dbt added 24 commits December 5, 2024 11:44

Update gitignore

3d96e7b

Make implicit httpx dependency

54fae7c

Add env variable for API root

12af77d

Change API endpoint for resolutions

1196360

Add common graph entities

d726b6c

Create client handler and migrate one thing

7311204

Remove resolution drawing from helpers

1b53dfc

Create common hash conversion to string

1c6df17

Make server use common graph entities

b3b6514

Merge branch 'main' into feature/mb51

ef365a0

Fix server ResolutionGraph test

f089414

Implement and test API endpoint for resolution graph

6bc1704

Move resolution graph to fixture

0aacdbe

Test drawing resolution graph

3b41bd4

Slim down handler logic

c151f39

Create final failing tests for resolution graphs

7f811f3

Get graph isomorphism test to pass

8e050ee

Change client handler naming conventions

0b9b1da

Only include direct parents in resolution graph

a5d9983

Start testing client handler using VCR

251b6c8

Generalise vcr cassettes path

1f549d0

Remove API command in just test

e2f8f6a

Run API in GitHub test pipeline

e4f7b15

Include API output in GitHub tests

6d2033e

lmazz1-dbt marked this pull request as ready for review December 7, 2024 20:21

lmazz1-dbt added 3 commits December 8, 2024 12:46

Run API in docker for testing

323e835

Rationalise use of environment variables

c078df4

Use dockerised API for development

f9df7f8

wpfl-dbt reviewed Dec 9, 2024

Small PR review fixes

9ceec93

Improve env files structure

a08b80d

wpfl-dbt approved these changes Dec 11, 2024

lmazz1-dbt merged commit 2eac995 into main Dec 11, 2024
5 checks passed

lmazz1-dbt deleted the feature/mb51 branch December 11, 2024 16:27
