Updated documentation and logos #164

Merged 1 commit on Aug 16, 2024
2 changes: 1 addition & 1 deletion .assets/FAQs.md
@@ -7,7 +7,7 @@ The [Job Queue](https://learn.microsoft.com/en-us/dynamics365/business-central/a
We recommend that a data lake container holds data for **only one** Business Central environment. After copying environments, ensure that the export destination on the setup page of the new environment points to a new data lake container.

### How do I export data from multiple companies in the same environment?
The export process copies the updated data to the lake for ONLY the company it has been invoked from. This is true whether you start the process by clicking the `Export` button or by scheduling a `Job Queue Entry`. Therefore, one should log in and click the button or set up scheduled jobs from the company whose data needs to be exported. A field called `Multi-company export` was added in [Pull Request #47](https://github.com/microsoft/bc2adls/pull/47) to improve concurrency for parallel exports from different companies. The field, in and of itself, does not export data from other companies.
The export process copies the updated data to the lake for ONLY the company it has been invoked from. This is true whether you start the process by clicking the `Export` button or by scheduling a `Job Queue Entry`. Therefore, one should log in and click the button or set up scheduled jobs from the company whose data needs to be exported. If you want to export data from multiple companies, the `export schema` must be set up first.

### Can I export calculated fields into the lake?
No, only persistent fields on the BC tables can be exported. However, [issue #88](/issues/88) describes a way to surface those fields when consuming the lake data.
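
One possible downstream pattern (not necessarily the exact approach described in issue #88) is to recompute calculated values from the persistent fields after loading the lake data; the entity path and column names below are placeholder assumptions.

```python
import pandas as pd

# Illustrative: BC calculated fields (FlowFields), such as a customer balance, are
# derived from persistent ledger entries, so they can be rebuilt after the fact.
entries = pd.read_parquet("data/DetailedCustLedgEntry-379/")  # placeholder entity path

# Sum the exported ledger amounts per customer to reproduce the calculated balance.
balance = entries.groupby("CustomerNo")["AmountLCY"].sum().rename("BalanceLCY")
print(balance.head())
```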
19 changes: 18 additions & 1 deletion .assets/Setup.md
@@ -1,3 +1,17 @@
The components involved in exporting to Azure Data Lake are the following:
- the **[businessCentral](/tree/main/businessCentral/)** folder holds a [BC extension](https://docs.microsoft.com/en-gb/dynamics365/business-central/ui-extensions) called `Azure Data Lake Storage Export` (ADLSE) which enables export of incremental data updates to a container on the data lake. The increments are stored in the CDM folder format described by the `deltas.cdm.manifest.json` manifest.
- the **[synapse](/tree/main/synapse/)** folder holds the templates needed to create an [Azure Synapse](https://azure.microsoft.com/en-gb/services/synapse-analytics/) pipeline that consolidates the increments into a final `data` CDM folder.

The following diagram illustrates the flow of data through a usage scenario. The main points are:
- Incremental update data from BC is moved to Azure Data Lake Storage through the ADLSE extension into the `deltas` folder.
- Triggering the Synapse pipeline(s) consolidates the increments into the `data` folder.
- The resulting data can be consumed by applications, such as Power BI, in the following ways:
  - CDM: via the `data.cdm.manifest.json` manifest
  - CSV/Parquet: via the underlying files for each individual entity inside the `data` folder
  - Spark/SQL: via [shared metadata tables](/.assets/SharedMetadataTables.md)

![Architecture](/.assets/architecture.png "Flow of data")
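
Once the pipeline has populated the `data` folder, a minimal consumption sketch from Spark (for example in a Synapse notebook) could look like the following; the storage account, container, and entity folder names are placeholder assumptions, so adjust them to your own setup.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available as `spark` in Synapse/Fabric notebooks

# Placeholder storage account, container, and entity folder.
abfss_root = "abfss://bc@mydatalake.dfs.core.windows.net"

# Read the consolidated "Customer" entity from the data folder (Parquet format).
customers = spark.read.parquet(f"{abfss_root}/data/Customer-18")

# If the configured CDM data format is CSV, the same entity could be read with:
# customers = spark.read.option("header", True).csv(f"{abfss_root}/data/Customer-18")

customers.show(5)
```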

The following steps take you through configuring your Dynamics 365 Business Central (BC) as well as Azure resources to enable the feature.

## Configuring the storage account
@@ -35,9 +49,12 @@ Let us take a look at the settings shown in the sample screenshot below,
- **Client secret** The client credential key you had defined (refer to **c)** in the picture at [Step 1](/.assets/Setup.md#step-1-create-an-azure-service-principal))
- **Max payload size (MiBs)** The size of the individual data payload that constitutes a single REST API upload operation to the data lake. A bigger size means fewer uploads but may consume more memory on the BC side. Note that each upload creates a new block within the blob in the data lake; the size of such blocks is constrained as described at [Put Block (REST API) - Azure Storage | Microsoft Docs](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block#remarks). A small sketch of this block-upload pattern follows the list below.
- **CDM data format** The format in which the exported data is stored on the data lake. The recommended format is Parquet, which is better at handling special characters in BC text fields. Note that the `deltas` folder will always store files in the CSV format, but the consolidated `data` folder will store files in the configured format.
- **Multi-company export** The flag to allow exporting data from multiple companies at the same time. You should enable this only after the export schema is finalized; in other words, ensure that at least one export for a company has been successful with all the desired tables and the desired fields in those tables. We recommend manually checking the JSON files in the outbound container before enabling this flag. Changes to the export schema (adding or removing tables as well as changing the field set to be exported) are not allowed as long as this flag is checked.
- **Skip row version sorting** Allows the records to be exported as they are fetched through SQL. This can be useful to avoid query timeouts when a large number of records has to be exported to the lake from a table, say, during the first export. Records are normally sorted ascending on their row version so that, in case of a failure, the next export can restart by exporting only those records with a row version higher than that of the last exported one. This helps incremental updates reach the lake in the same order the updates were made. Enabling this check may therefore cause a subsequent export job to re-send records that have already been exported to the lake, degrading performance on the next run. It is recommended to use this cautiously for only a few tables (while disabling export for all other tables) and to disable the check once all the data has been transferred to the lake.
- **Emit telemetry** The flag to enable or disable operational telemetry from this extension. It is set to True by default.
- **Translations** Choose the languages in which you want to export the enum translations. You have to refresh this every time a new translation is added; you can do so via `Related` and then `Enum translations`.
- **Export Enum as Integer** The flag to enable or disable exporting the enum values as integers. It is set to False by default.
- **Add delivered DateTime** The flag to indicate whether the delivered timestamp of the export should be included in the CSV file.
- **Export Company Database Tables** Choose the company from which the tables with `DataPerCompany = false` should be exported.
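
To make the block upload behind the **Max payload size (MiBs)** setting concrete, here is a minimal Python sketch of the same Put Block / Put Block List pattern using the `azure-storage-blob` SDK. It is illustrative only (the extension performs these REST calls from AL), and the connection string, container, blob path, and 4 MiB payload size are placeholder assumptions.

```python
from uuid import uuid4
from azure.storage.blob import BlobClient, BlobBlock

# Placeholder connection details; the extension authenticates with the service principal instead.
blob = BlobClient.from_connection_string(
    "<storage-connection-string>",
    container_name="bc",
    blob_name="deltas/Customer-18/export.csv",
)

payload = b"...large CSV payload produced by the export..."
max_payload_size = 4 * 1024 * 1024  # mirrors a "Max payload size" of 4 MiB

# Each chunk is staged as a separate block: a bigger payload size means fewer uploads.
block_ids = []
for offset in range(0, len(payload), max_payload_size):
    block_id = uuid4().hex
    blob.stage_block(block_id, payload[offset:offset + max_payload_size])
    block_ids.append(BlobBlock(block_id=block_id))

# Commit the staged blocks so they become the content of the blob.
blob.commit_block_list(block_ids)
```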

![The Export to Azure Data Lake Storage page](/.assets/bcAdlsePage.png)

20 changes: 17 additions & 3 deletions .assets/SetupFabric.md
@@ -1,3 +1,14 @@
The components involved in exporting to Microsoft Fabric are the following:
- the **[businessCentral](/tree/main/businessCentral/)** folder holds a [BC extension](https://docs.microsoft.com/en-gb/dynamics365/business-central/ui-extensions) called `Azure Data Lake Storage Export` (ADLSE) which enables export of incremental data updates to a container on the data lake. The increments are stored in the lakehouse folder in CSV format.
- the **[fabric](/tree/main/fabric/)** folder holds the template needed to create a `notebook` that moves the delta files into delta parquet tables in the lakehouse.

The following diagram illustrates the flow of data through a usage scenario. The main points are:
- Incremental update data from BC is moved to Microsoft Fabric through the ADLSE extension into the `deltas` folder in the lakehouse.
- Triggering the notebook consolidates the increments into the delta parquet tables.
- The resulting data can be consumed by Power BI or other tools inside Microsoft Fabric.

![Architecture](/.assets/architectureFabric.png "Flow of data")
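
A minimal sketch of such a consolidation step in a Fabric notebook is shown below; the shipped notebook in the **[fabric](/tree/main/fabric/)** folder is the reference implementation, and the entity folder and table names here are placeholder assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # available as `spark` in a Fabric notebook

entity = "Customer-18"  # placeholder entity folder exported by the extension

# Read the incremental CSV files the extension has placed in the lakehouse Files area.
deltas = spark.read.option("header", True).csv(f"Files/deltas/{entity}")

# Write into a Delta table in the lakehouse; a full consolidation would typically
# also deduplicate or merge the changes before writing.
deltas.write.format("delta").mode("append").saveAsTable(entity.replace("-", "_"))
```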

The following steps take you through configuring your Dynamics 365 Business Central (BC) as well as Azure resources to enable the feature.

## Configuring Azure
@@ -50,10 +61,13 @@ Once you have the `Azure Data Lake Storage Export` extension deployed, open the
Let us take a look at the settings shown in the sample screenshot below,
- **Storage Type** The type of storage to export to. Choose "Microsoft Fabric".
- **Tenant ID** The tenant id at which the app registration created above resides (refer to **b)** in the picture at [Step 1](/.assets/Setup.md#step-1-create-an-azure-service-principal))
- **Workspace** The workspace in your Microsoft Fabric environment where the lakehouse is located. This can also be a GUID.
- **Lakehouse** The name or GUID of the lakehouse inside the workspace.
- **Max payload size (MiBs)** The size of the individual data payload that constitutes a single REST API upload operation to the data lake. A bigger size means fewer uploads but may consume more memory on the BC side. Note that each upload creates a new block within the blob in the data lake; the size of such blocks is constrained as described at [Put Block (REST API) - Azure Storage | Microsoft Docs](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block#remarks).
- **Workspace** The workspace in your Microsoft Fabric environment where the lakehouse is located. This can also be a GUID. Be aware that the workspace name cannot contain spaces.
- **Lakehouse** The name or GUID of the lakehouse inside the workspace. The same naming convention applies here as for the workspace.
- **Skip row version sorting** Allows the records to be exported as they are fetched through SQL. This can be useful to avoid query timeouts when a large number of records has to be exported to the lake from a table, say, during the first export. Records are normally sorted ascending on their row version so that, in case of a failure, the next export can restart by exporting only those records with a row version higher than that of the last exported one. This helps incremental updates reach the lake in the same order the updates were made. Enabling this check may therefore cause a subsequent export job to re-send records that have already been exported to the lake, degrading performance on the next run. It is recommended to use this cautiously for only a few tables (while disabling export for all other tables) and to disable the check once all the data has been transferred to the lake.
- **Emit telemetry** The flag to enable or disable operational telemetry from this extension. It is set to True by default.
- **Translations** Choose the languages in which you want to export the enum translations. You have to refresh this every time a new translation is added; you can do so via `Related` and then `Enum translations`.
- **Export Enum as Integer** The flag to enable or disable exporting the enum values as integers. It is set to False by default.
- **Add delivered DateTime** The flag to indicate whether the delivered timestamp of the export should be included in the CSV file.
- **Export Company Database Tables** Choose the company from which the tables with `DataPerCompany = false` should be exported.

![Business Central Fabric](/.assets/businessCentralFabric.png)
Binary file added .assets/architectureFabric.png
Binary file modified .assets/bc2adls_banner.png
Binary file modified .assets/bc2adls_logo.png
Binary file modified .assets/bcAdlsePage.png
Binary file modified .assets/businessCentralFabric.png
19 changes: 2 additions & 17 deletions README.md
@@ -2,25 +2,10 @@
# Starting update
**The original repo of [bc2adls](https://github.com/microsoft/bc2adls) is in read-only mode, but since a lot of partners are using this tool we want to develop it further as open source. A special thanks to the creators of this tool, [Soumya Dutta](https://www.linkedin.com/in/soumya-dutta-07813a5/) and [Henri Schulte](https://www.linkedin.com/in/henrischulte/), who put a lot of effort into it.**

# Project

> **This tool is an <u>experiment</u> on Dynamics 365 Business Central with the sole purpose of discovering the possibilities of having data exported to an Azure Data Lake. To see the details of how this tool is supported, please visit [the Support page](./SUPPORT.md). In case you wish to use this tool for your next project and engage with us, you are welcome to create an issue or a pull request. As we are a small team, please expect delays in getting back to you.**

## Introduction

The **bc2adls** tool is used to export data from [Dynamics 365 Business Central](https://dynamics.microsoft.com/en-us/business-central/overview/) (BC) to [Azure Data Lake Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction) and expose it in the [CDM folder](https://docs.microsoft.com/en-us/common-data-model/data-lake) format. The components involved are the following,
- the **[businessCentral](/tree/main/businessCentral/)** folder holds a [BC extension](https://docs.microsoft.com/en-gb/dynamics365/business-central/ui-extensions) called `Azure Data Lake Storage Export` (ADLSE) which enables export of incremental data updates to a container on the data lake. The increments are stored in the CDM folder format described by the `deltas.cdm.manifest.json` manifest.
- the **[synapse](/tree/main/synapse/)** folder holds the templates needed to create an [Azure Synapse](https://azure.microsoft.com/en-gb/services/synapse-analytics/) pipeline that consolidates the increments into a final `data` CDM folder.

The following diagram illustrates the flow of data through a usage scenario. The main points are:
- Incremental update data from BC is moved to Azure Data Lake Storage through the ADLSE extension into the `deltas` folder.
- Triggering the Synapse pipeline(s) consolidates the increments into the data folder.
- The resulting data can be consumed by applications, such as Power BI, in the following ways:
- CDM: via the `data.cdm.manifest.json` manifest
- CSV/Parquet: via the underlying files for each individual entity inside the `data` folder
- Spark/SQL: via [shared metadata tables](/.assets/SharedMetadataTables.md)

![Architecture](/.assets/architecture.png "Flow of data")
The **bc2adls** tool is used to export incremental data from [Dynamics 365 Business Central](https://dynamics.microsoft.com/en-us/business-central/overview/) (BC) to [Microsoft Fabric](https://learn.microsoft.com/nl-nl/fabric/get-started/microsoft-fabric-overview) or [Azure Data Lake Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction).


More details:
- [Installation and configuration of the connection with Azure Data Lake](/.assets/Setup.md)
4 changes: 2 additions & 2 deletions SUPPORT.md
@@ -1,6 +1,6 @@
# Support

Please be aware that bc2adls is an experimental prototype which is not supported by the Dynamics 365 Business Central product group in Microsoft. You are welcome to use and build upon bc2adls, however, you are doing so at your own risk.
Please be aware that bc2adls is an open-source contribution that is not supported by the Dynamics 365 Business Central product group at Microsoft. You are welcome to use and build upon bc2adls; however, you are doing so at your own risk.
As we are only a small group, our support on this tool is very limited. Please expect delays in responding to your queries or bugs.
If the product group releases an officially supported data export feature, bc2adls will be deprecated. In that case, we will update this page to reflect this.

@@ -12,6 +12,6 @@ feature request as a new Issue. As this is a GitHub project, a self-service appr

For help and other relevant questions about using this project, please feel free to create a ticket or a pull request.

## Microsoft Support Policy
## Support Policy

Support for this project is limited to the resources listed above.