credentials moved to configuration, added configuration pages
1 parent 1fb60d7, commit e33c2e5
Showing 36 changed files with 499 additions and 42 deletions.
This file was deleted.
61 changes: 61 additions & 0 deletions
docs/website/docs/general-usage/configuration/config_providers.md
@@ -0,0 +1,61 @@
---
title: Secrets and Config Providers
description:
keywords: [credentials, secrets.toml, environment variables]
---

## Providers
If a function signature has arguments that may be injected, `dlt` looks for the argument values in providers. **The argument name is a key in the lookup**. In the case of `google_sheets()` it will look for `tab_names`, `credentials` and `only_strings`.

Each provider has its own key naming convention, and `dlt` is able to translate between them.

Providers form a hierarchy. At the top are environment variables, then the `secrets.toml` and `config.toml` files. Providers such as Google, AWS or Azure vaults can be inserted after the environment provider.

For example, if `spreadsheet_id` is found in the environment, `dlt` does not look into other providers.

Values passed explicitly in the code have the **highest** priority in the provider hierarchy.
Default values of the arguments have the **lowest** priority in the provider hierarchy.

> **Summary of the hierarchy**
> explicit args > env variables > ...vaults, airflow etc > secrets.toml > config.toml > default arg values

Secrets are handled only by the providers that support them. Some providers support only secrets (to reduce the number of requests made by `dlt` when searching sections):
1. `secrets.toml` and the environment may hold both config and secret values
2. `config.toml` may hold only config values, no secrets
3. the various vault providers hold only secrets; `dlt` skips them when looking for values that are not secrets.

⛔ Context-aware providers will activate in the right environments, i.e. on Airflow or AWS/GCP virtual machines.

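The hierarchy and the secrets rules above can be sketched as a simple ordered lookup. This is only an illustration under stated assumptions, not how `dlt` implements it: the `Provider` tuple and the `resolve` function are hypothetical names, and key translation between providers is left out.

```python
from typing import Any, Mapping, NamedTuple, Optional, Sequence, Tuple


class Provider(NamedTuple):
    """Hypothetical stand-in for a provider; not a dlt class."""
    name: str
    values: Mapping[str, Any]
    supports_secrets: bool = True
    only_secrets: bool = False


def resolve(
    key: str,
    providers: Sequence[Provider],
    is_secret: bool,
    explicit: Optional[Any] = None,
    default: Optional[Any] = None,
) -> Tuple[Any, str]:
    """Return (value, origin), walking the providers from highest to lowest priority."""
    # explicit arguments win over every provider
    if explicit is not None:
        return explicit, "explicit"
    for provider in providers:
        # secrets are only looked up in providers that support them
        if is_secret and not provider.supports_secrets:
            continue
        # secret-only providers (vaults) are skipped for plain config values
        if not is_secret and provider.only_secrets:
            continue
        if key in provider.values:
            return provider.values[key], provider.name
    # the default argument value has the lowest priority
    return default, "default"


providers = [
    Provider("env", {"spreadsheet_id": "from-env"}),
    Provider("vault", {"api_token": "from-vault"}, only_secrets=True),
    Provider("secrets.toml", {"spreadsheet_id": "from-secrets", "api_token": "from-secrets"}),
    Provider("config.toml", {"tab_names": ["tab1"]}, supports_secrets=False),
]

print(resolve("spreadsheet_id", providers, is_secret=False))
print(resolve("api_token", providers, is_secret=True))
```

The first provider that holds the key ends the search, which is why a `spreadsheet_id` found in the environment hides the one in `secrets.toml`.
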
### Provider key formats: toml vs. environment variable

Providers may use different formats for the keys. `dlt` translates the standard format, where sections and key names are separated by ".", into the provider-specific formats.

1. for `toml`, names are case-sensitive and sections are separated with "."
2. for environment variables, all names are upper-cased and sections are separated with a double underscore "__"

Example:
When `dlt` evaluates the request `dlt.secrets["my_section.gcp_credentials"]` it must find the `private_key` for the Google credentials. It will look
1. first in the env variable `MY_SECTION__GCP_CREDENTIALS__PRIVATE_KEY` and, if not found,
2. in `secrets.toml` with the key `my_section.gcp_credentials.private_key`

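The two key formats can be illustrated with a short sketch. The helpers `to_env_key` and `lookup_toml` are hypothetical names written for this example; they only mirror the naming rules listed above and are not part of `dlt`.

```python
from typing import Any, Mapping, Optional


def to_env_key(key: str) -> str:
    """Translate a dotted key into the environment variable format."""
    # upper-case every part and join sections with a double underscore
    return "__".join(part.upper() for part in key.split("."))


def lookup_toml(doc: Mapping[str, Any], key: str) -> Optional[Any]:
    """Walk a parsed toml document (nested mappings) along a dotted key."""
    node: Any = doc
    for part in key.split("."):
        # toml names are case-sensitive, so no normalization happens here
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


key = "my_section.gcp_credentials.private_key"
secrets = {"my_section": {"gcp_credentials": {"private_key": "-----BEGIN..."}}}

print(to_env_key(key))
print(lookup_toml(secrets, key))
```
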
### Environment provider
Looks for the values in the environment variables.

### Toml provider
The toml provider uses two `toml` files: `secrets.toml` to store secrets and `config.toml` to store configuration values. The default `.gitignore` file prevents `secrets.toml` from being added to source control and pushed. `config.toml` may be freely added.

**The toml provider always loads those files from the `.dlt` folder**, which is looked up **relative to the current working directory**. Example:
if your working dir is `my_dlt_project` and you have:
```
my_dlt_project:
  |
  pipelines/
    |---- .dlt/secrets.toml
    |---- google_sheets.py
```
in it and you run `python pipelines/google_sheets.py`, then `dlt` will look for `secrets.toml` in `my_dlt_project/.dlt/secrets.toml` and ignore the existing `my_dlt_project/pipelines/.dlt/secrets.toml`.

If you change your working dir to `pipelines` and run `python google_sheets.py`, it will look for `my_dlt_project/pipelines/.dlt/secrets.toml`, as (probably) expected.

*That was a common problem at our workshop, but believe me, all other layouts are even worse. I've tried.*

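The working-directory rule can be made concrete with a few lines of `pathlib`. This is a minimal sketch of the rule as described above, not `dlt`'s own code; `secrets_path` is a hypothetical helper.

```python
from pathlib import Path


def secrets_path(cwd: Path) -> Path:
    """Hypothetical helper: where secrets.toml is expected for a given working directory."""
    # the location of the script being run plays no role, only the working directory
    return cwd / ".dlt" / "secrets.toml"


project = Path("my_dlt_project")

# running `python pipelines/google_sheets.py` from my_dlt_project
print(secrets_path(project))

# running `python google_sheets.py` from my_dlt_project/pipelines
print(secrets_path(project / "pipelines"))
```

In a real run the working directory would come from `Path.cwd()`.
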
107 changes: 107 additions & 0 deletions
docs/website/docs/general-usage/configuration/config_specs.md
@@ -0,0 +1,107 @@
---
title: Configuration specs
description:
keywords: [credentials, secrets.toml, environment variables]
---

## Working with credentials (and other complex configuration values)

`GcpClientCredentialsWithDefault` is an example of a **spec**: a Python `dataclass` that describes the configuration fields, their types and default values. It can also parse various native representations of the configuration. Credentials marked with the `WithDefaults` mixin are also able to instantiate themselves from the machine/user default environment, i.e. Google's `default()` or AWS `.aws/credentials`.

As an example, let's use `ConnectionStringCredentials`, which represents a database connection string.

```python
import dlt
from dlt.common.configuration.specs import ConnectionStringCredentials

@dlt.source
def query(sql: str, dsn: ConnectionStringCredentials = dlt.secrets.value):
  ...
```

The source above executes the `sql` against the database defined in `dsn`. `ConnectionStringCredentials` makes sure you get the correct values with the correct types, and it understands the relevant native form of the credentials.

Example 1: use the dictionary form
```toml
[dsn]
database="dlt_data"
password="loader"
username="loader"
host="localhost"
```

Example 2: use the native form
```toml
dsn="postgres://loader:loader@localhost:5432/dlt_data"
```

Example 3: use the mixed form: the password is missing from the explicit dsn and will be taken from `secrets.toml`
```toml
dsn.password="loader"
```
```python
query("SELECT * FROM customers", "postgres://loader@localhost:5432/dlt_data")
# or
query("SELECT * FROM customers", {"database": "dlt_data", "username": "loader"...})
```

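To show how the native and dictionary forms relate, here is a rough sketch that parses a connection string with the standard library and fills the missing password from configured values, as in Example 3. It is not the parser used by `ConnectionStringCredentials`; `parse_dsn`, `merge_with_config` and the `drivername` and `port` field names are assumptions made for this illustration.

```python
from typing import Any, Dict, Mapping
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> Dict[str, Any]:
    """Split a native connection string into the dictionary form."""
    parts = urlsplit(dsn)
    return {
        "drivername": parts.scheme,  # assumed field name
        "username": parts.username,
        "password": parts.password,  # None when the dsn carries no password
        "host": parts.hostname,
        "port": parts.port,  # assumed field name
        "database": parts.path.lstrip("/"),
    }


def merge_with_config(explicit: Mapping[str, Any], configured: Mapping[str, Any]) -> Dict[str, Any]:
    """Explicit values win; fields that are missing fall back to configured ones."""
    merged = dict(configured)
    merged.update({k: v for k, v in explicit.items() if v is not None})
    return merged


# the explicit dsn from Example 3 has no password
explicit = parse_dsn("postgres://loader@localhost:5432/dlt_data")
# `dsn.password` as it would be read from secrets.toml
resolved = merge_with_config(explicit, {"password": "loader"})
print(resolved)
```
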
☮️ We will implement more credentials and let people reuse them when writing pipelines:
- to represent OAuth credentials
- API key + API secret
- AWS credentials

### Working with alternatives of credentials (Union types)
If your source/resource allows for many authentication methods, you can support those seamlessly for your user. The user just passes the right credentials and `dlt` will inject the right type into your decorated function.

Example:

> Read the whole [test](/tests/common/configuration/test_spec_union.py). It shows how to create unions of credentials that derive from a common class, so you can handle them seamlessly in your code.

```python
import os
from typing import Union

import dlt

# ZenApiKeyCredentials and ZenEmailCredentials are defined in the test linked above
@dlt.source
def zen_source(credentials: Union[ZenApiKeyCredentials, ZenEmailCredentials, str] = dlt.secrets.value, some_option: bool = False):
  # depending on what the user provides in config, ZenApiKeyCredentials or ZenEmailCredentials will be injected into the `credentials` argument
  # both classes implement `auth` so you can always call it
  credentials.auth()
  return dlt.resource([credentials], name="credentials")

# pass native value
os.environ["CREDENTIALS"] = "email:mx:pwd"
assert list(zen_source())[0].email == "mx"

# pass explicit native value
assert list(zen_source("secret:🔑:secret"))[0].api_secret == "secret"

# pass explicit dict
assert list(zen_source(credentials={"email": "emx", "password": "pass"}))[0].email == "emx"
```

> This applies not only to credentials but to all specs (see next chapter).

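The idea of picking one member of a union from whatever the user passes can be sketched without `dlt`. The classes and the `resolve_credentials` function below are hypothetical stand-ins written for this illustration; the `email:...` and `secret:...` native formats are taken from the example above, while the dispatch logic is an assumption and not the way `dlt` resolves unions internally.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass
class EmailCredentials:
    """Hypothetical stand-in for ZenEmailCredentials."""
    email: str
    password: str

    def auth(self) -> str:
        return f"email login as {self.email}"


@dataclass
class ApiKeyCredentials:
    """Hypothetical stand-in for ZenApiKeyCredentials."""
    api_key: str
    api_secret: str

    def auth(self) -> str:
        return f"api key login with {self.api_key}"


Credentials = Union[EmailCredentials, ApiKeyCredentials]


def resolve_credentials(value: Union[str, Mapping[str, str]]) -> Credentials:
    """Build the matching credentials type from a native string or a dict."""
    if isinstance(value, str):
        # native form: "<kind>:<first>:<second>"
        kind, first, second = value.split(":", 2)
        if kind == "email":
            return EmailCredentials(first, second)
        if kind == "secret":
            return ApiKeyCredentials(first, second)
        raise ValueError(f"unknown native credentials format: {kind}")
    # dict form: the field names decide which type is created
    if "email" in value:
        return EmailCredentials(**value)
    return ApiKeyCredentials(**value)


creds = resolve_credentials("email:mx:pwd")
# both types implement `auth`, so the caller does not need to know which one it got
print(creds.auth())
```
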
## Writing own specs

**specs** let you take full control over the function arguments:
- which values should be injected, their types and default values
- you can specify optional and final fields
- form hierarchical configurations (specs in specs)
- provide your own handlers for `on_error` or `on_resolved`
- provide your own native value parsers
- provide your own default credentials logic
- get all the Python dataclass goodies
- get all the Python `dict` goodies (`specs` instances can be created from dicts and serialized to dicts)

This is used a lot in the `dlt` core and may become useful for complicated sources.

In fact, a spec is synthesized for each decorated function. In the case of `google_sheets`, the following class is created:
```python
@configspec
class GoogleSheetsConfiguration:
  tab_names: List[str] = None  # mandatory
  credentials: GcpClientCredentialsWithDefault = None  # mandatory secret
  only_strings: Optional[bool] = False
```

> all specs derive from [BaseConfiguration](/dlt/common/configuration/specs/base_configuration.py)
> all credentials derive from [CredentialsConfiguration](/dlt/common/configuration/specs/base_configuration.py)
> Read the docstrings in the code above

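To give a feel for what synthesizing a spec from a decorated function could involve, here is a sketch that builds a plain dataclass from a function signature with the standard library. It is an assumption-laden illustration, not `dlt`'s implementation: `synthesize_spec` is a hypothetical name, every parameter is turned into a field, and nothing marks fields as mandatory or secret.

```python
import inspect
from dataclasses import field, fields, make_dataclass
from typing import Any, Callable, List, Optional


def synthesize_spec(func: Callable[..., Any]) -> type:
    """Create a dataclass whose fields mirror the parameters of `func`."""
    spec_fields = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = Any if param.annotation is inspect.Parameter.empty else param.annotation
        # parameters without a default get None here, like the mandatory fields above
        # (mutable defaults such as lists would need `default_factory` instead)
        default = None if param.default is inspect.Parameter.empty else param.default
        spec_fields.append((name, annotation, field(default=default)))
    # google_sheets -> GoogleSheetsConfiguration
    class_name = "".join(part.capitalize() for part in func.__name__.split("_")) + "Configuration"
    return make_dataclass(class_name, spec_fields)


def google_sheets(tab_names: List[str] = None, credentials: str = None, only_strings: Optional[bool] = False):
    ...


Spec = synthesize_spec(google_sheets)
print(Spec.__name__, [f.name for f in fields(Spec)])
```
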