From 5358ec8059ff0bfd5153edf09f48d650ba125f23 Mon Sep 17 00:00:00 2001
From: dat-a-man <98139823+dat-a-man@users.noreply.github.com>
Date: Thu, 30 Nov 2023 06:36:50 +0000
Subject: [PATCH 1/7] Added Personio documentation.

---
 .../verified-sources/personio.md              | 187 ++++++++++++++++++
 docs/website/sidebars.js                      |   1 +
 2 files changed, 188 insertions(+)
 create mode 100644 docs/website/docs/dlt-ecosystem/verified-sources/personio.md

diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
new file mode 100644
index 0000000000..db3aa1f4e2
--- /dev/null
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
@@ -0,0 +1,187 @@
+# Personio
+
+:::info Need help deploying these sources, or figuring out how to run them in your data stack?
+
+[Join our Slack community](https://join.slack.com/t/dlthub-community/shared_invite/zt-1n5193dbq-rCBmJ6p~ckpSFK4hCF2dYA)
+or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support engineer Adrian.
+:::
+
+Personio is a human resources management software that helps businesses streamline HR processes,
+including recruitment, employee data management, and payroll, in one platform.
+
+Our Personio verified source loads data using Personio API to your preferred
+[destination](../destinations).
+
+:::tip You can check out our pipeline example
+[here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py). :::
+
+Resources that can be loaded using this verified source are:
+
+| Name        | Description                                                                                |
+|-------------|--------------------------------------------------------------------------------------------|
+| employees   | Retrieves company employees' details (employees list, absence_entitlement, cost_centers). |
+| absences    | Retrieves a list of various types of employee absences                                    |
+| attendances | Retrieves attendance records for each employee                                            |
+
+## Setup Guide
+
+### Grab credentials
+
+To load data from Personio, you need to API credentials, `client_id` and `client_secret`:
+
+1. Sign in to your Personio account, and ensure that your user account has API access rights.
+1. Navigate to Settings > Integrations > API credentials.
+1. Click on "Generate new credentials."
+1. Assign necessary permissions to credentials, i.e. read access.
+
+### Initialize the verified source
+
+To get started with your data pipeline, follow these steps:
+
+1. Enter the following command:
+
+   ```bash
+   dlt init personio duckdb
+   ```
+
+   [This command](../../reference/command-line-interface) will initialize
+   [the pipeline example](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py)
+   with Personio as the [source](../../general-usage/source) and [duckdb](../destinations/duckdb.md)
+   as the [destination](../destinations).
+
+1. If you'd like to use a different destination, simply replace `duckdb` with the name of your
+   preferred [destination](../destinations).
+
+1. After running this command, a new directory will be created with the necessary files and
+   configuration settings to get started.
+
+For more information, read the
+[Walkthrough: Add a verified source.](../../walkthroughs/add-a-verified-source)
+
+### Add credentials
+
+1. In the `.dlt` folder, there's a file called `secrets.toml`. It's where you store sensitive
+   information securely, like access tokens. Keep this file safe.
+   Here's its format:
+
+   ```toml
+   # Put your secret values and credentials here
+   # Note: Do not share this file and do not push it to GitHub!
+   [sources.personio]
+   client_id = "papi-********-****-****-****-************" # please set me up!
+   client_secret = "papi-************************************************" # please set me up!
+   ```
+
+1. Replace the value of `client_id` and `client_secret` with the ones that
+   [you copied above](#grab-credentials). This will ensure that your verified source can access
+   your Personio API resources securely.
+
+1. Next, follow the instructions in [Destinations](../destinations/duckdb) to add credentials for
+   your chosen destination. This will ensure that your data is properly routed to its final
+   destination.
+
+For more information, read the [General Usage: Credentials.](../../general-usage/credentials)
+
+## Run the pipeline
+
+1. Before running the pipeline, ensure that you have installed all the necessary dependencies by
+   running the command:
+   ```bash
+   pip install -r requirements.txt
+   ```
+1. You're now ready to run the pipeline! To get started, run the following command:
+   ```bash
+   python personio_pipeline.py
+   ```
+1. Once the pipeline has finished running, you can verify that everything loaded correctly by using
+   the following command:
+   ```bash
+   dlt pipeline <pipeline_name> show
+   ```
+   For example, the `pipeline_name` for the above pipeline example is `personio`, you may also use
+   any custom name instead.
+
+For more information, read the [Walkthrough: Run a pipeline.](../../walkthroughs/run-a-pipeline)
+
+## Sources and resources
+
+`dlt` works on the principle of [sources](../../general-usage/source) and
+[resources](../../general-usage/resource).
+
+### Source `personio_source`
+
+This function initializes class `PersonioAPI` in "personio/helpers.py" and returns data resources
+like "employees", "absences", and "attendances".
+
+```python
+@dlt.source(name="personio")
+def personio_source(
+    client_id: str = dlt.secrets.value,
+    client_secret: str = dlt.secrets.value,
+    items_per_page: int = DEFAULT_ITEMS_PER_PAGE,
+) -> Iterable[DltResource]:
+```
+
+`client_id`: Generated ID for API access.
+
+`client_secret`: Generated secret for API access.
+
+`items_per_page`: Maximum number of items per page, defaults to 200.
+
+### Resource `employees`
+
+This resource retrieves data on all the employees in a company.
+
+```python
+  @dlt.resource(primary_key="id", write_disposition="merge")
+  def employees(
+      updated_at: dlt.sources.incremental[
+          pendulum.DateTime
+      ] = dlt.sources.incremental(
+          "last_modified_at", initial_value=None, allow_external_schedulers=True
+      ),
+      items_per_page: int = items_per_page,
+  ) -> Iterable[TDataItem]:
+```
+
+`updated_at`: The saved state of the last 'last_modified_at' value. It is used for
+[incremental loading](../../general-usage/incremental-loading).
+
+`items_per_page`: Maximum number of items per page, defaults to 200.
+
+Like the `employees` resource discussed above, the `absences` and `attendances` resources load
+data from the Personio API to your preferred destination.
+
+## Customization
+
+### Create your own pipeline
+
+If you wish to create your own pipelines, you can leverage source and resource methods from this
+verified source.
+
+1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows:
+
+   ```python
+   pipeline = dlt.pipeline(
+      pipeline_name="personio", # Use a custom name if desired
+      destination="duckdb", # Choose the appropriate destination (e.g., duckdb, redshift, postgres)
+      dataset_name="personio_data" # Use a custom name if desired
+   )
+   ```
+
+   :::note To read more about pipeline configuration, please refer to our
+   [documentation](../../general-usage/pipeline). :::
+
+1. To load employee data:
+
+   ```python
+   load_data = personio_source().with_resources("employees")
+   print(pipeline.run(load_data))
+   ```
+
+1. To load data from all supported endpoints:
+
+   ```python
+   load_data = personio_source().with_resources("employees", "absences", "attendances")
+   print(pipeline.run(load_data))
+   ```
diff --git a/docs/website/sidebars.js b/docs/website/sidebars.js
index 71be6ccaa8..3d731d64f6 100644
--- a/docs/website/sidebars.js
+++ b/docs/website/sidebars.js
@@ -52,6 +52,7 @@ const sidebars = {
         'dlt-ecosystem/verified-sources/mongodb',
         'dlt-ecosystem/verified-sources/mux',
         'dlt-ecosystem/verified-sources/notion',
+        'dlt-ecosystem/verified-sources/personio',
         'dlt-ecosystem/verified-sources/pipedrive',
         'dlt-ecosystem/verified-sources/salesforce',
         'dlt-ecosystem/verified-sources/shopify',

From 6f812d8494e9f8e5ecb05540897965337a636458 Mon Sep 17 00:00:00 2001
From: dat-a-man <98139823+dat-a-man@users.noreply.github.com>
Date: Thu, 30 Nov 2023 06:42:07 +0000
Subject: [PATCH 2/7] Update

---
 docs/website/docs/dlt-ecosystem/verified-sources/personio.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
index db3aa1f4e2..ea4388f704 100644
--- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
@@ -13,7 +13,8 @@ Our Personio verified source loads data using Personio API to your preferred
 [destination](../destinations).
 
 :::tip You can check out our pipeline example
-[here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py). :::
+[here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py).
+:::
 
 Resources that can be loaded using this verified source are:

From 2fa42c16248b7b979a8048eea5f3607051b8d27f Mon Sep 17 00:00:00 2001
From: dat-a-man <98139823+dat-a-man@users.noreply.github.com>
Date: Thu, 30 Nov 2023 06:45:43 +0000
Subject: [PATCH 3/7] =?UTF-8?q?Updated=C2=A0tip=20and=20info.?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../docs/dlt-ecosystem/verified-sources/personio.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
index ea4388f704..9f5bcd2290 100644
--- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md
@@ -12,8 +12,8 @@ including recruitment, employee data management, and payroll, in one platform.
 Our Personio verified source loads data using Personio API to your preferred
 [destination](../destinations).
 
-:::tip You can check out our pipeline example
-[here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py).
+:::tip +You can check out our pipeline example [here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py). ::: Resources that can be loaded using this verified source are: @@ -170,8 +170,9 @@ verified source. ) ``` - :::note To read more about pipeline configuration, please refer to our - [documentation](../../general-usage/pipeline). ::: + :::note + To read more about pipeline configuration, please refer to our [documentation](../../general-usage/pipeline). + ::: 1. To load employee data: From 45745a1c23451b5942177b669ebad50e9bf07442 Mon Sep 17 00:00:00 2001 From: dat-a-man <98139823+dat-a-man@users.noreply.github.com> Date: Thu, 30 Nov 2023 06:50:53 +0000 Subject: [PATCH 4/7] Updated secrets.toml --- docs/website/docs/dlt-ecosystem/verified-sources/personio.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md index 9f5bcd2290..4a46c5292c 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md @@ -69,8 +69,8 @@ For more information, read the # Put your secret values and credentials here # Note: Do not share this file and do not push it to GitHub! [sources.personio] - client_id = "papi-********-****-****-****-************" # please set me up! - client_secret = "papi-************************************************" # please set me up! + client_id = "papi-*****" # please set me up! + client_secret = "papi-*****" # please set me up! ``` 1. Replace the value of `client_id` and `client_secret` with the one that From 72f99e0dad0063f7c25301b1ac8505df8159a2ed Mon Sep 17 00:00:00 2001 From: dat-a-man <98139823+dat-a-man@users.noreply.github.com> Date: Sat, 2 Dec 2023 10:03:23 +0000 Subject: [PATCH 5/7] updated for comments. --- .../verified-sources/personio.md | 26 +++++++++---------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md index 4a46c5292c..692e34aa8e 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md @@ -9,7 +9,7 @@ or [book a call](https://calendar.app.google/kiLhuMsWKpZUpfho6) with our support Personio is a human resources management software that helps businesses streamline HR processes, including recruitment, employee data management, and payroll, in one platform. -Our Personio verified source loads data using Perosnio API to your preferred +Our [Personio verified](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio) source loads data using Perosnio API to your preferred [destination](../destinations). :::tip @@ -28,13 +28,17 @@ Resources that can be loaded using this verified source are: ### Grab credentials -To load data from Personio, you need to API credentials, `client_id` and `client_secret`: +To load data from Personio, you need to obtain API credentials, `client_id` and `client_secret`: 1. Sign in to your Personio account, and ensure that your user account has API access rights. 1. Navigate to Settings > Integrations > API credentials. 1. Click on "Generate new credentials." 1. Assign necessary permissions to credentials, i.e. read access. +:::info +The Personio UI, which is described here, might change. 
The full guide is available at this [link.](https://developer.personio.de/docs#21-employee-attendance-and-absence-endpoints) +::: + ### Initialize the verified source To get started with your data pipeline, follow these steps: @@ -56,8 +60,7 @@ To get started with your data pipeline, follow these steps: 1. After running this command, a new directory will be created with the necessary files and configuration settings to get started. -For more information, read the -[Walkthrough: Add a verified source.](../../walkthroughs/add-a-verified-source) +For more information, read [add a verified source.](../../walkthroughs/add-a-verified-source) ### Add credentials @@ -81,7 +84,7 @@ For more information, read the your chosen destination. This will ensure that your data is properly routed to its final destination. -For more information, read the [General Usage: Credentials.](../../general-usage/credentials) +For more information, read [credentials](../../general-usage/credentials). ## Run the pipeline @@ -102,7 +105,7 @@ For more information, read the [General Usage: Credentials.](../../general-usage For example, the `pipeline_name` for the above pipeline example is `personio`, you may also use any custom name instead. -For more information, read the [Walkthrough: Run a pipeline.](../../walkthroughs/run-a-pipeline) +For more information, read [run a pipeline.](../../walkthroughs/run-a-pipeline) ## Sources and resources @@ -111,8 +114,7 @@ For more information, read the [Walkthrough: Run a pipeline.](../../walkthroughs ### Source `personio_source` -This function initializes class `PersonioAPI` in "personio/helpers.py" and returns data resources -like "employees", "absences", and "attendances". +This `dlt` source returns data resources like "employees", "absences", and "attendances". ```python @dlt.source(name="personio") @@ -160,7 +162,7 @@ data from the Personio API to your preferred destination. If you wish to create your own pipelines, you can leverage source and resource methods from this verified source. -1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows: +1. Configure the [pipeline](../../general-usage/pipeline) by specifying the pipeline name, destination, and dataset as follows: ```python pipeline = dlt.pipeline( @@ -170,10 +172,6 @@ verified source. ) ``` - :::note - To read more about pipeline configuration, please refer to our [documentation](../../general-usage/pipeline). - ::: - 1. To load employee data: ```python @@ -184,6 +182,6 @@ verified source. 1. To load data from all supported endpoints: ```python - load_data = personio_source().with_resources("employees", "absences", "attendances") + load_data = personio_source() print(pipeline.run(load_data)) ``` From 1220c9addcf7872335678a25fc423db9b342ca2d Mon Sep 17 00:00:00 2001 From: dat-a-man <98139823+dat-a-man@users.noreply.github.com> Date: Mon, 4 Dec 2023 06:07:09 +0000 Subject: [PATCH 6/7] Updated title, description and keywords. 
--- .../website/docs/dlt-ecosystem/verified-sources/personio.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md index 692e34aa8e..28bfe20ddd 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md @@ -1,3 +1,9 @@ +--- +title: Personio +description: dlt verified source for Personio API +keywords: [personio api, personio verified source, personio] +--- + # Personio :::info Need help deploying these sources, or figuring out how to run them in your data stack? From e8f7fc520eb072fdd0117e6758ca8981fa560d7f Mon Sep 17 00:00:00 2001 From: AstrakhantsevaAA Date: Mon, 4 Dec 2023 11:55:53 +0100 Subject: [PATCH 7/7] fix docs titles --- .../verified-sources/personio.md | 26 +++++++++---------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md index 28bfe20ddd..c348b7089a 100644 --- a/docs/website/docs/dlt-ecosystem/verified-sources/personio.md +++ b/docs/website/docs/dlt-ecosystem/verified-sources/personio.md @@ -18,8 +18,8 @@ including recruitment, employee data management, and payroll, in one platform. Our [Personio verified](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio) source loads data using Perosnio API to your preferred [destination](../destinations). -:::tip -You can check out our pipeline example [here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py). +:::tip +You can check out our pipeline example [here](https://github.com/dlt-hub/verified-sources/blob/master/sources/personio_pipeline.py). ::: Resources that can be loaded using this verified source are: @@ -66,7 +66,7 @@ To get started with your data pipeline, follow these steps: 1. After running this command, a new directory will be created with the necessary files and configuration settings to get started. -For more information, read [add a verified source.](../../walkthroughs/add-a-verified-source) +For more information, read [Add a verified source.](../../walkthroughs/add-a-verified-source) ### Add credentials @@ -90,7 +90,7 @@ For more information, read [add a verified source.](../../walkthroughs/add-a-ver your chosen destination. This will ensure that your data is properly routed to its final destination. -For more information, read [credentials](../../general-usage/credentials). +For more information, read [Credentials](../../general-usage/credentials). ## Run the pipeline @@ -111,7 +111,7 @@ For more information, read [credentials](../../general-usage/credentials). For example, the `pipeline_name` for the above pipeline example is `personio`, you may also use any custom name instead. -For more information, read [run a pipeline.](../../walkthroughs/run-a-pipeline) +For more information, read [Run a pipeline.](../../walkthroughs/run-a-pipeline) ## Sources and resources @@ -143,14 +143,14 @@ This resource retrieves data on all the employees in a company. 
```python @dlt.resource(primary_key="id", write_disposition="merge") - def employees( - updated_at: dlt.sources.incremental[ - pendulum.DateTime - ] = dlt.sources.incremental( - "last_modified_at", initial_value=None, allow_external_schedulers=True - ), - items_per_page: int = items_per_page, - ) -> Iterable[TDataItem]: +def employees( + updated_at: dlt.sources.incremental[ + pendulum.DateTime + ] = dlt.sources.incremental( + "last_modified_at", initial_value=None, allow_external_schedulers=True + ), + items_per_page: int = items_per_page, +) -> Iterable[TDataItem]: ``` `updated_at`: The saved state of the last 'last_modified_at' value. It is used for