Commit

add section to destination tables
sh-rp committed Jun 11, 2024
1 parent 5e4fc36 commit 11063b7
Showing 1 changed file with 35 additions and 1 deletion.
36 changes: 35 additions & 1 deletion docs/website/docs/general-usage/destination-tables.md
@@ -303,4 +303,38 @@ load_info = pipeline.run(data, table_name="users")
Every time you run this pipeline, a new schema will be created in the destination database with a
datetime-based suffix. The data will be loaded into tables in this schema.
For example, the first time you run the pipeline, the schema will be named
`mydata_20230912064403`, the second time it will be named `mydata_20230912064407`, and so on.

## Loading data into existing tables not created by dlt

You can also load data from `dlt` into tables that already exist in the destination dataset and were not created by `dlt`.
There are a few things to keep in mind when you are doing this:

If you load data to a table that exists but does not contain any data, in most cases your load will succeed without problems.
`dlt` will create the needed columns and insert the incoming data. `dlt` is only aware of columns that exist in the
discovered or provided internal schema, so any columns in your destination that `dlt` does not anticipate will remain
in the destination but stay unknown to `dlt`. This is generally not a problem.

If your destination table already exists and contains columns that have the same name as columns discovered by `dlt` but
do not have matching data types, your load will fail. You will have to fix the column on the destination table first,
or rename the column in your incoming data to avoid a collision.
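
The renaming option can be sketched in plain Python, applied to the records before they reach `pipeline.run`. This is a minimal sketch, not part of `dlt`: the helper `rename_keys`, the sample rows, and the column names `created` and `created_text` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def rename_keys(
    rows: Iterable[Dict[str, Any]], mapping: Dict[str, str]
) -> Iterator[Dict[str, Any]]:
    """Yield copies of each row with keys renamed according to `mapping`.

    Hypothetical helper: keys that are not in `mapping` are kept unchanged.
    """
    for row in rows:
        yield {mapping.get(key, key): value for key, value in row.items()}


# Assumption: the destination already has a `created` column of another type,
# so the incoming `created` field is loaded under a different name instead.
rows = [
    {"id": 1, "created": "2024-06-11"},
    {"id": 2, "created": "2024-06-12"},
]
renamed = list(rename_keys(rows, {"created": "created_text"}))
print(renamed)
```

The renamed records can then be passed to the pipeline in place of the original data, so that `dlt` discovers a column name that does not clash with the existing one.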

If your destination table exists and already contains data, your load might also initially fail, since `dlt` creates
special `non-nullable` columns that contain mandatory metadata. Some databases will not allow you to create
`non-nullable` columns on tables that have data, since the initial value of these columns for the existing rows cannot
be inferred. In that case, you will have to manually create these columns with the correct type on your existing tables,
make them `nullable`, and then fill in values for the existing rows. Some databases may allow you to create a new
`non-nullable` column and supply a default value for existing rows in the same command. The columns you will need to
create are:

| name | type |
| --- | --- |
| _dlt_load_id | text/string/varchar |
| _dlt_id | text/string/varchar |

For child tables, you may also need to create:

| name | type |
| --- | --- |
| _dlt_parent_id | text/string/varchar |
| _dlt_root_id | text/string/varchar |
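
The manual preparation of an existing, populated table can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: SQLite stands in for your destination, the `users` table and its rows are hypothetical, and the backfilled values (a placeholder load id and a random hex string per row) are illustrative only and not necessarily the format `dlt` itself generates.

```python
import sqlite3
import uuid

# Hypothetical pre-existing table with data, not created by dlt.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

# SQLite rejects adding a NOT NULL column without a non-NULL default.
try:
    conn.execute("ALTER TABLE users ADD COLUMN _dlt_id TEXT NOT NULL")
    direct_add_failed = False
except sqlite3.OperationalError:
    direct_add_failed = True

# Workaround: add the metadata columns as nullable text columns.
for column in ("_dlt_load_id", "_dlt_id"):
    conn.execute(f"ALTER TABLE users ADD COLUMN {column} TEXT")

# Backfill existing rows. Both values are assumptions made for this sketch.
placeholder_load_id = "0"
row_ids = [row[0] for row in conn.execute("SELECT rowid FROM users").fetchall()]
for row_id in row_ids:
    conn.execute(
        "UPDATE users SET _dlt_load_id = ?, _dlt_id = ? WHERE rowid = ?",
        (placeholder_load_id, uuid.uuid4().hex, row_id),
    )
conn.commit()

# Count rows that still lack metadata values after the backfill.
missing = conn.execute(
    "SELECT COUNT(*) FROM users WHERE _dlt_load_id IS NULL OR _dlt_id IS NULL"
).fetchone()[0]
print(direct_add_failed, missing)
```

For child tables, the same pattern would apply to `_dlt_parent_id` and `_dlt_root_id`. The exact `ALTER TABLE` syntax and the text type name differ between databases, so adapt the statements to your destination.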
