Layout config is not respected in `filesystem` destination when using an `sql_database` source #2107

trymzet · 2024-11-28T14:46:54Z

dlt version

1.4.0

Describe the problem

When using the filesystem destination with, e.g., the following layout config:

[destination.filesystem]
bucket_url = ""
layout = "{schema_name}/{table_name}/{load_id}.{file_id}.{ext}"

The data is still loaded into {schema_name}/sql_database/{table_name}/{load_id}.{file_id}.{ext} (notice that a hardcoded sql_database directory is unexpectedly inserted by dlt).

Expected behavior

I think one of the following should happen:

This name should be controllable (e.g., if I have multiple SQL databases, I want to use a specific database name instead of the generic "sql_database"). This might be part of the broader issue that dlt currently does not support per-database configuration for sql_database sources (you either have one sql_database config or per-pipeline configs).
This behavior should be documented in the layout docs.
Or, the layout should be applied as specified in the config.

sh-rp · 2024-12-11T09:19:38Z

We should have a look at this, but the destination is completely independent of the source. @trymzet are you sure that this does not happen if you use some other source?

trymzet · 2024-12-13T10:36:22Z

@sh-rp I'm using the sql_table source now, and the layout produced is <pipeline_name>_dataset/{table_name}/{load_id}.{file_id}.{ext}. I guess schema_name is <pipeline_name>_dataset by default? If so, this one would align with the layout.

BTW, for sql_database, it might be possible to control this directory's name as described in #2114 (comment) (to be tested).

sh-rp · 2024-12-16T09:45:33Z

The first folder in the tree is your dataset name. You can set this with dataset_name="something" when constructing your pipeline. If you do not provide one, it will be generated from the pipeline name. You can read more about datasets and destinations in our docs to understand how this works. In the example in your first post, the dataset name comes first (the layout does not control this part), and the "sql_database" part that follows is the schema name, which you inserted yourself via the {schema_name} placeholder. If you don't want that, remove the schema_name part from the layout.
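
The explanation above can be sketched as a small, simplified model in plain Python. This is not dlt's implementation: `render_path` is a hypothetical helper, and the dataset name, table name, load ID, and file ID are made-up values. It only assumes what the thread states, namely that the dataset name is prepended to the rendered layout and that the schema name is "sql_database" for this source.

```python
def render_path(dataset_name: str, layout: str, **placeholders: str) -> str:
    """Hypothetical model: the dataset folder comes first, then the rendered layout."""
    return f"{dataset_name}/{layout.format(**placeholders)}"


# Illustrative placeholder values (assumptions, not real dlt output).
values = {
    "schema_name": "sql_database",
    "table_name": "orders",
    "load_id": "1732805214.123",
    "file_id": "abc123",
    "ext": "parquet",
}

# Layout from the original post: the schema name shows up as a directory.
with_schema = render_path(
    "my_dataset", "{schema_name}/{table_name}/{load_id}.{file_id}.{ext}", **values
)

# Same layout with the {schema_name} placeholder removed, as suggested above.
without_schema = render_path(
    "my_dataset", "{table_name}/{load_id}.{file_id}.{ext}", **values
)

print(with_schema)
print(without_schema)
```

In this model, the "sql_database" directory appears only because the layout contains `{schema_name}`; dropping that placeholder leaves the dataset folder followed directly by the table folder.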

github-project-automation bot added this to dlt core library Nov 28, 2024

github-project-automation bot moved this to Todo in dlt core library Nov 28, 2024

trymzet closed this as completed Dec 3, 2024

github-project-automation bot moved this from Todo to Done in dlt core library Dec 3, 2024

trymzet reopened this Dec 3, 2024

sh-rp self-assigned this Dec 16, 2024

sh-rp added the question Further information is requested label Dec 16, 2024
