fix generate_source example #164

Merged 3 commits on Mar 7, 2024
101 changes: 64 additions & 37 deletions README.md
Macros that generate dbt code, and log it to the command line.

# Contents

- [dbt-codegen](#dbt-codegen)
- [Contents](#contents)
- [Installation instructions](#installation-instructions)
- [Usage:](#usage-5)

# Installation instructions

New to dbt packages? Read more about them [here](https://docs.getdbt.com/docs/building-a-dbt-project/package-management/).

1. Include this package in your `packages.yml` file — check [here](https://hub.getdbt.com/dbt-labs/codegen/latest/) for the latest version number:

```yml
packages:
- package: dbt-labs/codegen
version: X.X.X ## update to latest version here
```

2. Run `dbt deps` to install the package.

# Macros

## generate_source ([source](macros/generate_source.sql))

This macro generates lightweight YAML for a [Source](https://docs.getdbt.com/docs/using-sources),
which you can then paste into a schema file.

### Arguments

- `schema_name` (required): The schema name that contains your source data
- `database_name` (optional, default=target.database): The database that your
source data is in.
- `table_names` (optional, default=none): A list of tables that you want to generate the source definitions for.
- `generate_columns` (optional, default=False): Whether you want to add the
column names to your source definition.
- `include_descriptions` (optional, default=False): Whether you want to add
description placeholders to your source definition.
- `include_data_types` (optional, default=True): Whether you want to add data
types to your source columns definitions.
- `table_pattern` (optional, default='%'): A table prefix / postfix that you
want to subselect from all available tables within a given schema.
- `exclude` (optional, default=''): A string you want to exclude from the selection criteria
- `name` (optional, default=schema_name): The name of your source
- `include_database` (optional, default=False): Whether you want to add
the database to your source definition
- `include_schema` (optional, default=False): Whether you want to add
the schema to your source definition
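
For instance, a call that exercises several of these arguments at once (the schema, database, and table names below are hypothetical) might look like:

```
{{ codegen.generate_source(
    schema_name='jaffle_shop',
    database_name='raw',
    table_names=['orders', 'customers'],
    generate_columns=True,
    include_descriptions=True
) }}
```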

### Outputting to a file

If you use `dbt run-operation`, you can write the output directly to a file by piping it and passing the `--quiet` CLI flag:

```shell
dbt --quiet run-operation generate_model_yaml --args '{"model_name": "stg_jaffle_shop__orders"}' > models/staging/jaffle_shop/stg_jaffle_shop__orders.yml
dbt --quiet run-operation generate_source --args '{"table_names": ["orders"]}' > models/staging/jaffle_shop/_sources.yml
```

### Usage:

1. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code

```
{{ codegen.generate_source('raw_jaffle_shop') }}
```

   or for multiple arguments

```
{{ codegen.generate_source(schema_name='jaffle_shop', database_name='raw') }}
```
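As a sketch of what to expect, compiling the macro against a hypothetical `raw_jaffle_shop` schema containing an `orders` table yields YAML along these lines (your table and column lists will differ):

```yml
version: 2

sources:
  - name: raw_jaffle_shop
    tables:
      - name: orders
```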
3. Paste the output into a schema `.yml` file, and refactor as required.

## generate_base_model ([source](macros/generate_base_model.sql))

This macro generates the SQL for a base model, which you can then paste into a
model.

### Arguments:

- `source_name` (required): The source you wish to generate base model SQL for.
- `table_name` (required): The source table you wish to generate base model SQL for.
- `leading_commas` (optional, default=False): Whether you want your commas to be leading (vs trailing).
- `case_sensitive_cols` (optional, default=False): Whether your source table has case-sensitive column names. If true, keeps the case of the column names from the source.
- `materialized` (optional, default=None): Set materialization style (e.g. table, view, incremental) inside of the model's `config` block. If not set, materialization style will be controlled by `dbt_project.yml`

### Usage:

1. Create a source for the table you wish to create a base model on top of.
2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code

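As a sketch, compiling the macro for a hypothetical source table `raw_jaffle_shop.orders` produces a base model shaped like this (the column names are illustrative):

```sql
with source as (

    select * from {{ source('raw_jaffle_shop', 'orders') }}

),

renamed as (

    select
        id,
        customer_id,
        status

    from source

)

select * from renamed
```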
4. Paste the output into a model, and refactor as required.

## create_base_models ([source](macros/create_base_models.sql))

This macro generates a series of terminal commands (chained with `&&` so they run one after another) that execute the [base_model_creation](#base_model_creation-source) bash script. This bash script writes the output of the [generate_base_model](#generate_base_model-source) macro into new model files in your local dbt project.

> **Note**: This macro is not compatible with the dbt Cloud IDE.

### Arguments:

- `source_name` (required): The source you wish to generate base model SQL for.
- `tables` (required): A list of all tables you want to generate the base models for.

### Usage:

1. Create a source for the table you wish to create a base model on top of.
2. Copy the macro into a statement tab in your local IDE, and run your code

```shell
dbt run-operation codegen.create_base_models --args '{source_name: my-source, tables: ["this-table","that-table"]}'
```

## base_model_creation ([source](bash_scripts/base_model_creation.sh))

This bash script, when executed from your local command line, creates model files in your dbt project that contain the outputs of the [generate_base_model](macros/generate_base_model.sql) macro.

> **Note**: This macro is not compatible with the dbt Cloud IDE.

### Arguments:

- `source_name` (required): The source you wish to generate base model SQL for.
- `tables` (required): A list of all tables you want to generate the base models for.

### Usage:

1. Create a source for the table you wish to create a base model on top of.
2. Copy the macro into a statement tab in your local IDE, and run your code

```shell
source dbt_packages/codegen/bash_scripts/base_model_creation.sh "source_name" ["table_one","table_two"]
```

## generate_model_yaml ([source](macros/generate_model_yaml.sql))

This macro generates the YAML for a list of models, which you can then paste into a
schema.yml file.

### Arguments:

- `model_names` (required): The model(s) you wish to generate YAML for.
- `upstream_descriptions` (optional, default=False): Whether you want to include descriptions for identical column names from upstream models and sources.
- `include_data_types` (optional, default=True): Whether you want to add data types to your model column definitions.

### Usage:

1. Create a model.
2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code

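As a sketch, compiling `{{ codegen.generate_model_yaml(model_names=['customers']) }}` for a hypothetical `customers` model yields YAML along these lines (the column names and data types are illustrative):

```yml
version: 2

models:
  - name: customers
    description: ""
    columns:
      - name: customer_id
        data_type: integer
        description: ""
```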
4. Paste the output into a schema.yml file, and refactor as required.

## generate_model_import_ctes ([source](macros/generate_model_import_ctes.sql))

This macro generates the SQL for a given model with all references pulled up into import CTEs, which you can then paste back into the model.

### Arguments:

- `model_name` (required): The model you wish to generate SQL with import CTEs for.
- `leading_commas` (optional, default=False): Whether you want your commas to be leading (vs trailing).

### Usage:

1. Create a model with your original SQL query.
2. Copy the macro into a statement tab in the dbt Cloud IDE, or into an analysis file, and compile your code
