README
Signed-off-by: Bruno Campos <[email protected]>
BfdCampos committed Aug 31, 2023
1 parent bb96a33 commit dfedbc3
Showing 1 changed file with 43 additions and 16 deletions.
59 changes: 43 additions & 16 deletions README.md
@@ -1,29 +1,56 @@
# dbt package: source_db
This is a dbt package that allows the user to specify where the dbt command should read the data from.
# dbt-source_db

## Use case
For when you work with multiple environments, warehouses, accounts, or schemas and at times want to WRITE to one place but READ from another.
The `dbt-source_db` package allows you to specify the source database that dbt should read from. This enables reading from one database and writing to another.

dbt already covers the WRITE side with `--target`: you create multiple profiles with the appropriate settings, then specify where you want your `run` to READ from and WRITE to. ([Choosing the right Snowflake warehouse when running dbt](https://about.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/#choosing-the-right-snowflake-warehouse-when-running-dbt)).
## Getting Started

However, in some cases you may want to READ from a particular warehouse and WRITE to the warehouse specified in your `profile`, for example when you work in a "sandbox" environment before opening the PR that brings the code into a production environment.
Install the package:

## Example use
Let's say you want to run a model in your dbt project called `my_model_a`.
```bash
# Run this after declaring the package (see the next step)
dbt deps
```

In your `packages.yml` file (at the same level as `dbt_project.yml`), add the package:

```yml
packages:
- package: dbt-labs/source_db
```
## Usage
Set the `SRC_DB` environment variable to the source database you want dbt to read from:

```bash
export SRC_DB=dev_db
```

Then run dbt as usual. The `ref()` and `source()` macros will read from `SRC_DB` instead of the default target database.

Or you can set the variable within the same command.

In your project you have a *sandbox* environment where you are free to develop and try different solutions, and a *prd* environment to which changes are released once the code has been reviewed and approved.
For example:

To develop in your *sandbox* environment, you need the tables copied or cloned from *prd*. This is easy when you need only one or two tables, but it becomes a pain when you need many.
```bash
SRC_DB=dev_db dbt run
```

The default behaviour of `dbt run --models +my_model_a` is to compile all the dbt code and READ all `ref`s and `source`s from the specified warehouse.
This will read all sources and refs from `dev_db`, but write to the database in your profile/target.
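The difference between the two ways of setting `SRC_DB` shown above is ordinary shell scoping, and it can be checked without dbt. The following is a minimal sketch in which `sh -c` stands in for `dbt run` (an assumption made purely for illustration; the `child` snippet is hypothetical and not part of the package). It prints the value of `SRC_DB` that a child process would see in each case.

```bash
#!/usr/bin/env bash
# Stand-in for `dbt run`: a child process that reports the SRC_DB it sees.
# (Hypothetical snippet for illustration; not part of the package.)
child='printf "%s" "${SRC_DB:-unset}"'

unset SRC_DB

# Inline assignment: SRC_DB exists only for this one command.
inline_value=$(SRC_DB=dev_db sh -c "$child")

# The next command no longer sees it.
after_inline=$(sh -c "$child")

# export: every later command started from this shell sees it.
export SRC_DB=dev_db
after_export=$(sh -c "$child")

echo "inline:       $inline_value"
echo "after inline: $after_inline"
echo "after export: $after_export"
```

The inline form is the safer choice for one-off runs, since a later `dbt run` in the same shell goes back to reading from the target database without any cleanup.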

> So it compiles the code `SELECT * FROM {{ ref('my_upstream_model') }}` to `SELECT * FROM sandbox.schema.my_upstream_model`.
## Example

What if we want to develop __only__ our new model but with data from a particular environment like *dev*? With this package, you can run:
You have a _sandbox_ and a _production_ database. You want to test a new model `my_model` in _sandbox_ while reading data from _production_.

Run the model:

```bash
SRC_DB=DEV dbt run --models +my_model_a
SRC_DB=prod_db dbt run --models my_model
```
This compiles `SELECT * FROM {{ ref('my_upstream_model') }}` to `SELECT * FROM DEV.schema.my_upstream_model` and writes into the environment from your profile as expected: `CREATE OR REPLACE TABLE sandbox.schema.my_model AS ( ... )`.

This is handy when you need to test something locally without copying everything one by one, and it is all done directly from within your dbt project.
This will read from `prod_db` but write `my_model` to _sandbox_.

## Macro reference

- `ref(model_name)`: Reads `model_name` from `SRC_DB` instead of the target database.

- `source(source_name, table_name)`: Reads `table_name` from `SRC_DB` instead of the target database.
