disables index creation by default, updates docs
rudolfix committed Sep 12, 2023
1 parent c13988c commit 157dc90
Showing 2 changed files with 47 additions and 3 deletions.
2 changes: 1 addition & 1 deletion dlt/destinations/mssql/configuration.py
@@ -77,7 +77,7 @@ class MsSqlClientConfiguration(DestinationClientDwhWithStagingConfiguration):
destination_name: Final[str] = "mssql" # type: ignore
credentials: MsSqlCredentials

create_indexes: bool = True
create_indexes: bool = False

def fingerprint(self) -> str:
"""Returns a fingerprint of host part of a connection string"""
48 changes: 46 additions & 2 deletions docs/website/docs/dlt-ecosystem/destinations/mssql.md
@@ -15,14 +15,19 @@ This can't be included with `dlt`'s Python dependencies, so you must install it

See the instructions to [install the Microsoft ODBC Driver 18 for SQL Server on Windows, Mac, and Linux](https://learn.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver16)

The following ODBC drivers are supported:
* ODBC Driver 18 for SQL Server
* ODBC Driver 17 for SQL Server
You can also [configure the driver name explicitly](#additional-destination-options).

### Create a pipeline

**1. Initialize a project with a pipeline that loads to MS SQL by running**
```
dlt init chess mssql
```

**2. Install the necessary dependencies for BigQuery by running**
**2. Install the necessary dependencies for MS SQL by running**
```
pip install -r requirements.txt
```
@@ -45,12 +50,51 @@ port = 1433
connect_timeout = 15
```

You can also pass a SQLAlchemy-like database connection string:
```toml
# keep it at the top of your toml file! before any section starts
destination.mssql.credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?connect_timeout=15"
```

To pass credentials directly, use the `credentials` argument of the `dlt.pipeline` or `pipeline.run` methods.
```python
pipeline = dlt.pipeline(pipeline_name='chess', destination='mssql', dataset_name='chess_data', credentials="mssql://loader:<password>@loader.database.windows.net/dlt_data?connect_timeout=15")
```

## Write disposition
All write dispositions are supported.

If you set the [`replace` strategy](../../general-usage/full-loading.md) to `staging-optimized`, the destination tables will be dropped and
recreated using `ALTER SCHEMA ... TRANSFER`. The operation is atomic: MS SQL supports DDL transactions.
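
For illustration, a minimal sketch of picking a write disposition per run, assuming the `chess` pipeline from the steps above and a made-up `players` table; `write_disposition` accepts `append`, `replace`, and `merge`:

```python
import dlt

# hypothetical pipeline mirroring the example above; credentials are
# resolved from secrets.toml as shown in the setup section
pipeline = dlt.pipeline(
    pipeline_name="chess",
    destination="mssql",
    dataset_name="chess_data",
)

# "replace" triggers the full-load behavior described above
info = pipeline.run(
    [{"id": 1, "name": "example"}],
    table_name="players",
    write_disposition="replace",
)
print(info)
```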

## Data loading
Data is loaded via INSERT statements by default. MS SQL limits INSERT statements to 1000 rows each, and `dlt` batches inserts at that limit.

## Supported file formats
* [insert-values](../file-formats/insert-format.md) is used by default
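
A hedged sketch of requesting this format explicitly on a run; it assumes the `loader_file_format` argument of `pipeline.run` and `insert_values` as the identifier of the insert-values format:

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="chess", destination="mssql", dataset_name="chess_data")

# explicitly request the default format for mssql (normally unnecessary)
info = pipeline.run(
    [{"id": 1}],
    table_name="players",
    loader_file_format="insert_values",
)
```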

## Supported column hints
**mssql** can create unique indexes for all columns with `unique` hints, but this behavior is **disabled by default** (see [Additional destination options](#additional-destination-options)).
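
A minimal sketch of attaching a `unique` hint to a column via the `columns` argument of `dlt.resource`; the resource and column names below are made up:

```python
import dlt

# "event_id" carries a unique hint; with create_indexes enabled (see
# Additional destination options), mssql builds a UNIQUE index for it
@dlt.resource(columns={"event_id": {"unique": True}})
def events():
    yield [{"event_id": 1, "payload": "started"}]
```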

## Syncing of `dlt` state
This destination fully supports [dlt state sync](../../general-usage/state#syncing-state-with-destination).

## Data types

MS SQL does not support JSON columns, so JSON objects are stored as strings in an `nvarchar` column.
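
For example, a nested Python object in the data is serialized on load; a short sketch with made-up field names:

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="chess", destination="mssql", dataset_name="chess_data")

# "profile" is a nested object; it is serialized to a JSON string and
# stored in an nvarchar column rather than a native JSON type
pipeline.run(
    [{"id": 1, "profile": {"rating": 1850, "titles": ["CM"]}}],
    table_name="players",
)
```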

## Additional destination options
The **mssql** destination **does not** create UNIQUE indexes by default on columns with the `unique` hint (i.e. `_dlt_id`). To enable this behavior:
```toml
[destination.mssql]
create_indexes=true
```
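
The same setting can also be supplied through the environment; this sketch assumes dlt's standard environment-variable naming for the config entry above:

```python
import os

# assumed equivalent of [destination.mssql] create_indexes=true;
# set before the pipeline runs so dlt picks it up during config resolution
os.environ["DESTINATION__MSSQL__CREATE_INDEXES"] = "true"
```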

You can explicitly set the ODBC driver name:
```toml
[destination.mssql.credentials]
odbc_driver="ODBC Driver 18 for SQL Server"
```

### dbt support
No dbt support yet.
