Feature/mtchbx 45 #24
base: main
Conversation
    Args:
        file (BinaryIO): File to upload.
        bucket_name (str): Target S3 bucket.
        object_name (str): S3 object name. If not specified, file_name is used.
What's `file_name`?
class SourceItem(BaseModel):
    """Response model for source"""

    schema: str
    table: str
    id: str
    resolution: str | None = None


class Sources(BaseModel):
    """Response model for sources"""

    sources: list[SourceItem]
should we start moving this stuff to common?
See below. I think my favoured approach is separate `Response*` classes for now?
class Sources(BaseModel):
    """Response model for sources"""

    sources: list[SourceItem]
Do we need a separate Pydantic object? Can't we just return `list[Sources]`?
`Sources` requires a warehouse, and therefore a valid engine, and it doesn't expose the resolution hash (though it could). Two solutions:

- A `SourceBase` with the common fields, then `Source` and `SourceResponse` as subclasses, with connection-enabled fields in one and the resolution hash in the other
- Keep response models separate to common -- they need structures that aren't relevant to the common objects
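For illustration, the first option could look roughly like this; the `engine` field and the exact class names are assumptions, not a settled design:

```python
from pydantic import BaseModel, ConfigDict
from sqlalchemy.engine import Engine


class SourceBase(BaseModel):
    """Fields shared by the common object and the API response (assumed)."""

    schema: str
    table: str
    id: str


class Source(SourceBase):
    """Common object: carries the connection-enabled fields."""

    model_config = ConfigDict(arbitrary_types_allowed=True)

    engine: Engine  # hypothetical: the warehouse connection lives here


class SourceResponse(SourceBase):
    """API response: exposes the resolution hash, needs no warehouse."""

    resolution: str | None = None
```

The second option keeps `SourceItem`/`Sources` as they are in this PR, just living next to the API rather than in common.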
) -> dict[str, SourceItem] | str:
    datasets = backend.datasets.list()
    for dataset in datasets:
        resolution = hexlify(dataset.resolution).decode("ascii")
can we use base64?
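For comparison, a minimal sketch of both encodings; URL-safe base64 is shorter than hex and safe to embed in a path (the value below is hypothetical):

```python
import base64
from binascii import hexlify

resolution = b"\x8a\x1f\x00\xfe"  # hypothetical raw resolution hash

hex_str = hexlify(resolution).decode("ascii")                   # current: '8a1f00fe'
b64_str = base64.urlsafe_b64encode(resolution).decode("ascii")  # suggested: 'ih8A_g=='

# decoding on the way back in
assert base64.urlsafe_b64decode(b64_str) == resolution
```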
return {"source": result_obj} | ||
return "Source not found" |
If the source is not found we need to return a 404, and use a response with a proper schema, probably a codified error message. And if it is found, let's just return the object, no need to wrap it in a dict.
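A minimal sketch of that shape, reusing `SourceItem` from this PR; the route path and the lookup helper are placeholders, and the exact error schema is still to be agreed:

```python
from fastapi import HTTPException


@app.get("/sources/{source_hash}", response_model=SourceItem)  # hypothetical route
async def get_source(source_hash: str) -> SourceItem:
    source = find_source(source_hash)  # placeholder for the backend lookup
    if source is None:
        raise HTTPException(status_code=404, detail="Source not found")
    return source  # the object itself, not wrapped in a dict
```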
return "Source not found" | ||
|
||
|
||
@app.post("/sources/uploadFile") |
camelCase? Are we sure? We should also follow the endpoint convention we'd set
@app.post("/sources/uploadFile")
async def add_source_to_s3(
    file: UploadFile, bucket_name: str = Form(...), object_name: str = Form(...)
we shouldn't allow specifying a bucket name, nor an object name (directly)
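One way to do that, sketched with an assumed server-side `settings` object: the bucket comes from configuration and the object key is generated, so the client only sends the file:

```python
from uuid import uuid4

from fastapi import UploadFile


@app.post("/sources/upload")  # hypothetical path, pending the naming discussion
async def add_source_to_s3(file: UploadFile):
    bucket_name = settings.datasets_bucket              # assumed: from server config
    object_name = f"sources/{uuid4()}/{file.filename}"  # derived, never client-supplied
    ...
```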
    if is_file_uploaded:
        return "File was successfully uploaded"
    return "File could not be uploaded"
HTTP status codes please, and Pydantic schemas for the response
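For instance, a hedged sketch with a small response schema and explicit status codes (the names and the upload helper are illustrative, not final):

```python
from fastapi import HTTPException, UploadFile, status
from pydantic import BaseModel


class UploadResponse(BaseModel):  # hypothetical response schema
    message: str


@app.post(
    "/sources/upload",
    response_model=UploadResponse,
    status_code=status.HTTP_201_CREATED,
)
async def add_source_to_s3(file: UploadFile) -> UploadResponse:
    if not upload_to_s3(file):  # placeholder for the existing upload logic
        raise HTTPException(status_code=500, detail="File could not be uploaded")
    return UploadResponse(message="File was successfully uploaded")
```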
@app.post("/sources/uploadFile")
async def add_source_to_s3(
    file: UploadFile, bucket_name: str = Form(...), object_name: str = Form(...)
From a quick look at the docs, they're using annotations for form parameters
https://fastapi.tiangolo.com/tutorial/request-forms/
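i.e. something along these lines, per that page (shown only for the annotated style; the comments above about dropping `bucket_name`/`object_name` still stand):

```python
from typing import Annotated

from fastapi import Form, UploadFile


@app.post("/sources/uploadFile")
async def add_source_to_s3(
    file: UploadFile,
    bucket_name: Annotated[str, Form()],
    object_name: Annotated[str, Form()],
): ...
```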
# def test_list_sources():
#     response = client.get("/sources")
#     assert response.status_code == 200
@patch("matchbox.server.base.BackendManager.get_backend")
The tests will need to be updated to reflect requested changes in the API
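For example, a possible shape for the not-found case once the API changes land (the import path and the backend mocking are assumptions based on the existing test):

```python
from unittest.mock import Mock, patch

from fastapi.testclient import TestClient

from matchbox.server.api import app  # assumed import path for the FastAPI app

client = TestClient(app)


@patch("matchbox.server.base.BackendManager.get_backend")
def test_get_source_not_found(get_backend: Mock):
    """Hypothetical test for the 404 path."""
    get_backend.return_value = Mock(datasets=Mock(list=Mock(return_value=[])))

    response = client.get("/sources/doesnotexist")

    assert response.status_code == 404
```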
As per my comments
Context
We need to begin implementing endpoints for our API. This PR implements endpoints for `Source` objects.
Changes proposed in this pull request
Guidance to review
The POST endpoint is based on the assumption that the client can send an entire file. This may not be a realistic assumption: for large files, the client will need to send chunks as multipart objects, which in turn will need to be sent as multipart uploads to S3. Unit tests have been written only for the two GET endpoints, as the POST is more of an integration test.
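One hedged option for the server-to-S3 leg is boto3's managed transfer, which streams and runs a multipart upload internally; the client-to-API leg would still need chunked or streamed uploads for very large files. Bucket and key names below are placeholders:

```python
import boto3
from fastapi import UploadFile

s3 = boto3.client("s3")


def stream_file_to_s3(file: UploadFile, bucket: str, key: str) -> None:
    """Stream an uploaded file to S3 without reading it fully into memory.

    upload_fileobj performs a managed multipart upload under the hood for
    large files; UploadFile.file is the underlying file-like object.
    """
    s3.upload_fileobj(file.file, bucket, key)
```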
Checklist: