diff --git a/source/documentation/tools/create-a-derived-table/style-guide.md b/source/documentation/tools/create-a-derived-table/style-guide.md
index 6db12810..537891dd 100644
--- a/source/documentation/tools/create-a-derived-table/style-guide.md
+++ b/source/documentation/tools/create-a-derived-table/style-guide.md
@@ -31,13 +31,13 @@ Once you've established a clear style, stay consistent. This is the most importa
 ### Essential styling
 - 🔑 Each model should have a primary key and that primary key should be the first field in the table.
 - 🔑 The primary key of a model should be named `_id`, for example, `account_id`. This makes it easier to know what `id` is being referenced in downstream joined models.
+- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or 'id'. This helps users to understand how tables can be joined together.
 - Use underscores (i.e. snake case) for naming dbt models; avoid dots or camel case.
   - ✅ `models_without_dots`
   - ❌ `models.with.dots`
   - ❌ `CamelCaseModels`
   - Most data platforms use dots to separate `database.schema.object`, so using underscores instead of dots reduces your need for [quoting](/reference/resource-properties/quoting). For more background, refer to [this GitHub issue](https://github.com/dbt-labs/dbt-core/issues/3246).
 - 🔑 Keys should be string data types. We advise using the hash function `dbt_utils.generate_surrogate_key` ([guidance here](https://github.com/dbt-labs/dbt-utils?tab=readme-ov-file#generate_surrogate_key-source)) to create unique keys - this ensures there is a unique id for each row, as well as making the ids uniform in length. Some advice (for example Kimball) suggests integer data type instead of string, but we feel this is outdated.
-- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or 'id'. This helps users to understand how tables can be joined together.
 - ❌ Avoid reserved words as column names. create-a-derived-table reserved words, for example `order` (full list [here](https://docs.aws.amazon.com/athena/latest/ug/reserved-words.html)).
 - ➕ Booleans should be prefixed with `is_` or `has_`.
 - 🕰️ Timestamp columns should be named `_at`(for example, `created_at`) and should be in UTC. If a different timezone is used, this should be indicated with a suffix (`created_at_pt`).
@@ -172,6 +172,8 @@ order by defendant_on_case_id
 
 ## 'Import' CTEs
 
+For information on what CTEs are, see the [dbt docs](https://docs.getdbt.com/terms/cte).
+
 - 🔝 All `{{ ref('...') }}` statements should be placed in CTEs at the top of the file.
 - 📦 'Import' CTEs should be named after the table they are referencing.
 - 'Import' CTEs should be short and concise to make it easy to read what tables are being read in. Try not to have lots of fields in your select statement as it makes reading the CTEs harder, `select *` should be sufficient in most cases.
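
As a rough illustration of the naming rules touched by this diff, the sketch below checks a model against them: snake-case model names, a primary key ending in `_id` as the first field, string-typed keys, `is_`/`has_` booleans, and `_at` timestamps with an optional timezone suffix. The `check_model` helper, its input shape, and the simplified type labels (`string`, `boolean`, `timestamp`) are assumptions made for this example; they are not part of dbt or create-a-derived-table.

```python
import re

# Hypothetical helper for illustration only; not provided by dbt or dbt_utils.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
TIMESTAMP_NAME = re.compile(r"_at(_[a-z]+)?$")  # created_at, created_at_pt


def check_model(model_name, columns):
    """Return a list of style problems for one model.

    `columns` is an ordered list of (column_name, data_type) pairs, where
    data_type is a simplified label such as "string" or "boolean".
    """
    problems = []

    # Model names use underscores, not dots or camel case.
    if not SNAKE_CASE.match(model_name):
        problems.append(f"model name '{model_name}' is not snake case")

    # The primary key comes first and is named like `account_id`.
    first = columns[0][0] if columns else ""
    if not (first.endswith("_id") and len(first) > len("_id")):
        problems.append("first column should be a primary key ending in '_id'")

    for name, data_type in columns:
        # Keys should be string data types.
        if name.endswith("_id") and data_type != "string":
            problems.append(f"key '{name}' should be a string, not {data_type}")
        # Booleans are prefixed with is_ or has_.
        if data_type == "boolean" and not name.startswith(("is_", "has_")):
            problems.append(f"boolean '{name}' should start with 'is_' or 'has_'")
        # Timestamps end in _at, optionally followed by a timezone suffix.
        if data_type == "timestamp" and not TIMESTAMP_NAME.search(name):
            problems.append(f"timestamp '{name}' should end in '_at'")

    return problems


if __name__ == "__main__":
    for problem in check_model(
        "models.with.dots",
        [("id", "integer"), ("active", "boolean"), ("created", "timestamp")],
    ):
        print(problem)
```

A check like this only covers the mechanical rules; consistent field names across models (for example `customer_id` everywhere rather than `user_id` in some places) still need a human review of how the tables join.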