Update style-guide.md
hollyfurniss-moj authored Aug 6, 2024
1 parent 00fb9ac commit 1119fa2
Showing 1 changed file with 3 additions and 1 deletion.
@@ -31,13 +31,13 @@ Once you've established a clear style, stay consistent. This is the most importa
### Essential styling
- 🔑 Each model should have a primary key and that primary key should be the first field in the table.
- 🔑 The primary key of a model should be named `<object>_id`, for example, `account_id`. This makes it easier to know what `id` is being referenced in downstream joined models.
- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or `id`. This helps users to understand how tables can be joined together.
- Use underscores (i.e. snake case) for naming dbt models; avoid dots or camel case.
  - ✅ `models_without_dots`
  - ❌ `models.with.dots`
  - ❌ `CamelCaseModels`
  - Most data platforms use dots to separate `database.schema.object`, so using underscores instead of dots reduces your need for [quoting](/reference/resource-properties/quoting). For more background, refer to [this GitHub issue](https://github.com/dbt-labs/dbt-core/issues/3246).
- 🔑 Keys should be string data types. We advise using the hash function `dbt_utils.generate_surrogate_key` ([guidance here](https://github.com/dbt-labs/dbt-utils?tab=readme-ov-file#generate_surrogate_key-source)) to create unique keys. This ensures there is a unique id for each row and makes the ids uniform in length. Some advice (for example, Kimball) suggests an integer data type instead of a string, but we feel this is outdated.
- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or `id`. This helps users to understand how tables can be joined together.
- ❌ Avoid reserved words as column names. Reserved words in create-a-derived-table include `order`, for example (full list [here](https://docs.aws.amazon.com/athena/latest/ug/reserved-words.html)).
- ➕ Booleans should be prefixed with `is_` or `has_`.
- 🕰️ Timestamp columns should be named `<event>_at` (for example, `created_at`) and should be in UTC. If a different timezone is used, this should be indicated with a suffix (`created_at_pt`).
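
As a rough illustration of the naming rules above, here is a minimal sketch of a dbt model. The model name `raw_accounts` and its columns (`account_number`, `active_flag`, `overdue_flag`, `created`) are hypothetical, and the sketch assumes the `dbt_utils` package is installed.

```sql
-- Hypothetical staging model; all source names below are assumptions for illustration.
with raw_accounts as (

    select * from {{ ref('raw_accounts') }}

)

select
    -- Primary key comes first, is named <object>_id, and is a hashed string
    {{ dbt_utils.generate_surrogate_key(['account_number']) }} as account_id,

    -- Same key name as used in the customers model, so joins are obvious
    customer_id,

    -- Booleans are prefixed with is_ or has_
    active_flag as is_active,
    overdue_flag as has_overdue_balance,

    -- Timestamps are named <event>_at and are assumed to be in UTC
    created as created_at

from raw_accounts
```

None of the output column names here is a reserved word, so no quoting should be needed when the model is queried downstream.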
@@ -172,6 +172,8 @@ order by defendant_on_case_id

## 'Import' CTEs

For information on what CTEs are, see the [dbt docs](https://docs.getdbt.com/terms/cte).

- 🔝 All `{{ ref('...') }}` statements should be placed in CTEs at the top of the file.
- 📦 'Import' CTEs should be named after the table they are referencing.
- 'Import' CTEs should be short and concise so that it is easy to see which tables are being read in. Avoid listing lots of fields in the select statement, as this makes the CTEs harder to read; `select *` should be sufficient in most cases.
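
The layout described above might look something like the following sketch. The model names `customers` and `orders` and their columns are hypothetical and only serve to show the pattern.

```sql
-- 'Import' CTEs: one per referenced model, named after it, at the top of the file
with customers as (

    select * from {{ ref('customers') }}

),

orders as (

    select * from {{ ref('orders') }}

),

-- Logic CTEs follow the imports and select only from the CTEs above
customer_orders as (

    select
        customers.customer_id,
        count(orders.order_id) as order_count
    from customers
    left join orders
        on customers.customer_id = orders.customer_id
    group by customers.customer_id

)

select * from customer_orders
```

Keeping every `ref` in the first few lines means a reader can see the model's dependencies at a glance without scanning the transformation logic.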