Merge branch 'issue_3909' of https://github.com/hhunter-ms/docs into …

…issue_3909
dapr · Jul 8, 2024 · ba2c043 · ba2c043
2 parents 02087f2 + 8272c18
commit ba2c043

Showing 12 changed files with 242 additions and 18 deletions.
diff --git a/.../developing-applications/building-blocks/workflow/workflow-features-concepts.md b/.../developing-applications/building-blocks/workflow/workflow-features-concepts.md
@@ -248,7 +248,7 @@ You can use the following two techniques to write workflows that may need to sch
 
 Because workflows are long-running and durable, updating workflow code must be done with extreme care. As discussed in the [workflow determinism]({{< ref "#workflow-determinism-and-code-restraints" >}}) limitation section, workflow code must be deterministic. Updates to workflow code must preserve this determinism if there are any non-completed workflow instances in the system. Otherwise, updates to workflow code can result in runtime failures the next time those workflows execute.
 
-[See known limitations]({{< ref "workflow-features-concepts.md#workflow-determinism-and-code-restraints" >}})
+[See known limitations]({{< ref "#limitations" >}})
 
 ## Workflow activities
 

diff --git a/daprdocs/content/en/operations/configuration/configuration-overview.md b/daprdocs/content/en/operations/configuration/configuration-overview.md
@@ -110,17 +110,30 @@ metrics:
   rules: []
   http:
     increasedCardinality: true
+    pathMatching:
+      - /items
+      - /orders/{orderID}
+      - /orders/{orderID}/items/{itemID}
+      - /payments/{paymentID}
+      - /payments/{paymentID}/status
+      - /payments/{paymentID}/refund
+      - /payments/{paymentID}/details
+    excludeVerbs: false
 ```
 
+In the example above, the path filter `/orders/{orderID}/items/{itemID}` returns a single metric count covering all `orderID` and `itemID` values, rather than a separate metric for each `itemID`. For more information, see [HTTP metrics path matching]({{< ref "metrics-overview.md#http-metrics-path-matching" >}}).
+
 The following table lists the properties for metrics:
 
 | Property     | Type   | Description |
 |--------------|--------|-------------|
 | `enabled` | boolean | When set to true, the default, enables metrics collection and the metrics endpoint. |
 | `rules`   | array | Named rule to filter metrics. Each rule contains a set of `labels` to filter on and a `regex` expression to apply to the metrics path. |
-| `http.increasedCardinality` | boolean | When set to true, in the Dapr HTTP server each request path causes the creation of a new "bucket" of metrics. This can cause issues, including excessive memory consumption, when there many different requested endpoints (such as when interacting with RESTful APIs).<br>In Dapr 1.13 the default value is `true` (to preserve the behavior of Dapr <= 1.12), but will change to `false` in Dapr 1.14. |
+| `http.increasedCardinality` | boolean | When set to `true` (default), in the Dapr HTTP server, each request path causes the creation of a new "bucket" of metrics. This can cause issues, including excessive memory consumption, when there are many different requested endpoints (such as when interacting with RESTful APIs).<br> To mitigate high memory usage and egress costs associated with [high cardinality metrics]({{< ref "metrics-overview.md#high-cardinality-metrics" >}}) with the HTTP server, you should set the `metrics.http.increasedCardinality` property to `false`.|
+| `http.pathMatching` | array | Paths used for path matching, allowing users to define matching paths in order to manage cardinality. |
+| `http.excludeVerbs` | boolean | When set to `true` (default is `false`), the Dapr HTTP server ignores each request HTTP verb when building the method metric label. |
 
-To mitigate high memory usage and egress costs associated with [high cardinality metrics]({{< ref "metrics-overview.md#high-cardinality-metrics" >}}) with the HTTP server, you should set the `metrics.http.increasedCardinality` property to `false`.
+To further help manage cardinality, path matching groups request paths that match the defined patterns under a single path label, reducing the number of unique metrics paths and thus controlling metric cardinality. This feature is particularly useful for applications with dynamic URLs, ensuring that metrics remain meaningful and manageable without excessive memory consumption.
 
 Using rules, you can set regular expressions for every metric exposed by the Dapr sidecar. For example:
 

diff --git a/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-dapr-shared.md b/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-dapr-shared.md
@@ -0,0 +1,87 @@
+---
+type: docs
+title: "Deploy Dapr per-node or per-cluster with Dapr Shared"
+linkTitle: "Dapr Shared"
+weight: 50000
+description: "Learn more about using Dapr Shared as an alternative deployment to sidecars"
+
+---
+
+Dapr automatically injects a sidecar to enable the Dapr APIs for your applications for the best availability and reliability. 
+
+Dapr Shared enables two alternative deployment strategies to create Dapr applications: a Kubernetes `DaemonSet` for a per-node deployment or a `Deployment` for a per-cluster deployment.
+
+- **`DaemonSet`:** When running Dapr Shared as a Kubernetes `DaemonSet` resource, the daprd container runs on each Kubernetes node in the cluster. This can reduce network hops between the applications and Dapr. 
+- **`Deployment`:** When running Dapr Shared as a Kubernetes `Deployment`, the Kubernetes scheduler decides on which single node in the cluster the daprd container instance runs.
+
+{{% alert title="Dapr Shared deployments" color="primary" %}}
+For each Dapr application you deploy, you need to deploy the Dapr Shared Helm chart using different `shared.appId`s.
+{{% /alert %}}
+
+
+
+## Why Dapr Shared?
+
+By default, when Dapr is installed into a Kubernetes cluster, the Dapr control plane injects Dapr as a sidecar to applications annotated with Dapr annotations (`dapr.io/enabled: "true"`). Sidecars offer many advantages, including improved resiliency, since there is an instance per application and all communication between the application and the sidecar happens without involving the network.
+
+
+<img src="/images/dapr-shared/sidecar.png" width=800 style="padding-bottom:15px;">
+
+While sidecars are Dapr's default deployment, some use cases require other approaches. Let's say you want to decouple the lifecycle of your workloads from the Dapr APIs. A typical example of this is functions, or function-as-a-service runtimes, which might automatically downscale your idle workloads to free up resources. For such cases, keeping the Dapr APIs and all the Dapr async functionalities (such as subscriptions) separate might be required. 
+
+Dapr Shared was created for these scenarios, extending the Dapr sidecar model with two new deployment approaches: `DaemonSet` (per-node) and `Deployment` (per-cluster).
+
+{{% alert title="Important" color="primary" %}}
+No matter which deployment approach you choose, it is important to understand that in most use cases, you have one instance of Dapr Shared (Helm release) per service (app-id). This means that if you have an application composed of three microservices, each service is recommended to have its own Dapr Shared instance. You can see this in action by trying the [Hello Kubernetes with Dapr Shared tutorial](https://github.com/dapr/dapr-shared/blob/main/docs/tutorial/README.md). 
+{{% /alert %}}
+
+
+### `DaemonSet` (Per-node)
+
+With Kubernetes `DaemonSet`, you can define applications that need to be deployed once per node in the cluster. This enables applications that are running on the same node to communicate with local Dapr APIs, no matter where the Kubernetes `Scheduler` schedules your workload.
+
+<img src="/images/dapr-shared/daemonset.png" width=800 style="padding-bottom:15px;">
+
+{{% alert title="Note" color="primary" %}}
+Since `DaemonSet` installs one instance per node, it consumes more resources in your cluster than a per-cluster `Deployment`, with the advantage of improved resiliency.
+{{% /alert %}}
+
+
+### `Deployment` (Per-cluster)
+
+Kubernetes `Deployments` are installed once per cluster. Based on available resources, the Kubernetes `Scheduler` decides on which node the workload is scheduled. For Dapr Shared, this means that your workload and the Dapr instance might be located on separate nodes, which can introduce considerable network latency with the trade-off of reduced resource usage.
+
+<img src="/images/dapr-shared/deployment.png" width=800 style="padding-bottom:15px;">
+
+## Getting Started with Dapr Shared
+
+{{% alert title="Prerequisites" color="primary" %}}
+Before installing Dapr Shared, ensure you have [Dapr installed in your cluster]({{< ref "kubernetes-deploy.md" >}}).
+{{% /alert %}}
+
+If you want to get started with Dapr Shared, you can create a new Dapr Shared instance by installing the official Helm Chart:
+
+```
+helm install my-shared-instance oci://registry-1.docker.io/daprio/dapr-shared-chart --set shared.appId=<DAPR_APP_ID> --set shared.remoteURL=<REMOTE_URL> --set shared.remotePort=<REMOTE_PORT> --set shared.strategy=deployment
+```
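+
+As an alternative to repeating `--set` flags, the same settings can be kept in a Helm values file. The following is a minimal sketch that uses only the `shared.*` keys from the command above. The file name `values.yaml` is an arbitrary choice, and the placeholders must be replaced with your own values:
+
+```yaml
+# values.yaml (hypothetical file name)
+shared:
+  appId: <DAPR_APP_ID>        # app ID served by this Dapr Shared instance
+  remoteURL: <REMOTE_URL>     # placeholder, as in the command above
+  remotePort: <REMOTE_PORT>   # placeholder, as in the command above
+  strategy: deployment        # the strategy used in the command above
+```
+
+The file can then be passed to the same chart with `helm install my-shared-instance oci://registry-1.docker.io/daprio/dapr-shared-chart -f values.yaml`.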
+
+Your Dapr-enabled applications can now make use of the Dapr Shared instance by pointing the Dapr SDKs to or sending requests to the `my-shared-instance-dapr` Kubernetes service exposed by the Dapr Shared instance. 
+
+> The `my-shared-instance` above is the Helm Chart release name. 
+
+If you are using the Dapr SDKs, you can set the following environment variables for your application to connect to the Dapr Shared instance (in this case, running on the `default` namespace): 
+
+```
+        env:
+        - name: DAPR_HTTP_ENDPOINT
+          value: http://my-shared-instance-dapr.default.svc.cluster.local:3500
+        - name: DAPR_GRPC_ENDPOINT
+          value: http://my-shared-instance-dapr.default.svc.cluster.local:50001 
+```
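+
+For context, these variables belong under a container's `env` list in your workload manifest. The following is a minimal sketch of a Kubernetes `Deployment` that sets them. The name `my-app` and the image placeholder are hypothetical and not part of Dapr Shared:
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: my-app                      # hypothetical application name
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: my-app
+  template:
+    metadata:
+      labels:
+        app: my-app
+    spec:
+      containers:
+      - name: my-app
+        image: <YOUR_APP_IMAGE>     # placeholder, replace with your image
+        env:
+        # Point the Dapr SDK at the Dapr Shared service in the default namespace
+        - name: DAPR_HTTP_ENDPOINT
+          value: http://my-shared-instance-dapr.default.svc.cluster.local:3500
+        - name: DAPR_GRPC_ENDPOINT
+          value: http://my-shared-instance-dapr.default.svc.cluster.local:50001
+```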
+
+If you are not using the SDKs, you can send HTTP or gRPC requests to those endpoints. 
+
+## Next steps
+
+- Try the [Hello Kubernetes tutorial with Dapr Shared](https://github.com/dapr/dapr-shared/blob/main/docs/tutorial/README.md).
+- Read more in the [Dapr Shared repo](https://github.com/dapr/dapr-shared/blob/main/README.md).
diff --git a/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-hybrid-clusters.md b/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-hybrid-clusters.md
@@ -2,7 +2,7 @@
 type: docs
 title: "Deploy to hybrid Linux/Windows Kubernetes clusters"
 linkTitle: "Hybrid clusters"
-weight: 60000
+weight: 70000
 description: "How to run Dapr apps on Kubernetes clusters with Windows nodes"
 ---
 

diff --git a/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-job.md b/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-job.md
@@ -2,7 +2,7 @@
 type: docs
 title: "Running Dapr with a Kubernetes Job"
 linkTitle: "Kubernetes Jobs"
-weight: 70000
+weight: 80000
 description: "Use Dapr API in a Kubernetes Job context"
 ---
 

diff --git a/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-volume-mounts.md b/daprdocs/content/en/operations/hosting/kubernetes/kubernetes-volume-mounts.md
@@ -2,7 +2,7 @@
 type: docs
 title: "How-to: Mount Pod volumes to the Dapr sidecar"
 linkTitle: "How-to: Mount Pod volumes"
-weight: 80000
+weight: 90000
 description: "Configure the Dapr sidecar to mount Pod Volumes"
 ---
 

diff --git a/daprdocs/content/en/operations/observability/metrics/metrics-overview.md b/daprdocs/content/en/operations/observability/metrics/metrics-overview.md
@@ -70,15 +70,135 @@ spec:
     enabled: false
 ```
 
-## High cardinality metrics
+## Optimizing HTTP metrics reporting with path matching
 
-When invoking Dapr using HTTP, the legacy behavior (and current default as of Dapr 1.13) is to create a separate "bucket" for each requested method. When working with RESTful APIs, this can cause very high cardinality, with potential negative impact on memory usage and CPU.
+When invoking Dapr using HTTP, metrics are created for each requested method by default. This can result in a high number of metrics, known as high cardinality, which can impact memory usage and CPU.
 
-Dapr 1.13 introduces a new option for the Dapr Configuration resource `spec.metrics.http.increasedCardinality`: when set to `false`, it reports metrics for the HTTP server for each "abstract" method (for example, requesting from a state store) instead of creating a "bucket" for each concrete request path.
+Path matching allows you to manage and control the cardinality of HTTP metrics in Dapr. It aggregates metrics: rather than reporting a metric for each distinct request path, you report one overall count per matched path pattern. For details on how to set the cardinality in configuration, see the [metrics configuration]({{< ref "configuration-overview.md#metrics" >}}).
+
+This configuration is opt-in and is enabled via the Dapr configuration `spec.metrics.http.pathMatching`. When defined, request paths that match one of the specified patterns are reported under the pattern itself rather than under the concrete path. This reduces the number of unique metrics paths, making metrics more manageable and reducing resource consumption in a controlled way.
+
+When `spec.metrics.http.pathMatching` is combined with the `increasedCardinality` flag set to `false`, non-matched paths are transformed into a catch-all bucket to control and limit cardinality, preventing unbounded path growth. Conversely, when `increasedCardinality` is `true` (the default), non-matched paths are passed through as they normally would be, allowing for potentially higher cardinality but preserving the original path data. 
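+
+Putting these settings together, a complete Configuration resource might look like the following. This is a minimal sketch: the resource name `metrics-config` is a hypothetical placeholder, and the path pattern is only an illustration.
+
+```yaml
+apiVersion: dapr.io/v1alpha1
+kind: Configuration
+metadata:
+  name: metrics-config              # hypothetical name, choose your own
+spec:
+  metrics:
+    enabled: true
+    http:
+      # With low cardinality, unmatched paths fall into a catch-all bucket
+      increasedCardinality: false
+      # Requests such as /orders/1 and /orders/2 are grouped under this pattern
+      pathMatching:
+        - /orders/{orderID}
+```
+
+On Kubernetes, an application selects a Configuration by name through the `dapr.io/config` annotation.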
+
+### Examples of Path Matching in HTTP Metrics
+
+The following examples demonstrate how to use path matching in Dapr to manage HTTP metrics. In each example, the metrics are collected from 5 HTTP requests to the `/orders` endpoint with different order IDs. By adjusting cardinality and utilizing path matching, you can fine-tune metric granularity to balance detail and resource efficiency.
+
+These examples illustrate the cardinality of the metrics, highlighting that high cardinality configurations result in many entries, which correspond to higher memory usage for handling metrics. For simplicity, the following examples focus on a single metric: `dapr_http_server_request_count`.
+
+#### Low cardinality with path matching (recommended)
+
+Configuration:
+```yaml
+http:
+  increasedCardinality: false
+  pathMatching:
+    - /orders/{orderID}
+```
+
+Metrics generated:
+```
+# matched paths
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/{orderID}",status="200"} 5
+# unmatched paths
+dapr_http_server_request_count{app_id="order-service",method="GET",path="",status="200"} 1
+```
+
+With low cardinality and path matching configured, you get the best of both worlds by grouping the metrics for the important endpoints without compromising the cardinality. This approach helps avoid high memory usage and potential security issues.
+
+#### Low cardinality without path matching
+
+Configuration:
+
+```yaml
+http:
+  increasedCardinality: false
+```
+Metrics generated:
+```
+dapr_http_server_request_count{app_id="order-service",method="GET",path="",status="200"} 5
+```
+
+In low cardinality mode, the path, which is the main source of unbounded cardinality, is dropped. This results in metrics that primarily indicate the number of requests made to the service for a given HTTP method, but without any information about the paths invoked. 
+
+
+#### High cardinality with path matching
+
+Configuration:
+```yaml
+http:
+  increasedCardinality: true
+  pathMatching:
+    - /orders/{orderID}
+```
+
+Metrics generated:
+```
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/{orderID}",status="200"} 5
+```
+
+This example results from the same HTTP requests as the example above, but with path matching configured for the path `/orders/{orderID}`. By using path matching, you achieve reduced cardinality by grouping the metrics based on the matched path.
+
+#### High cardinality without path matching
+
+Configuration:
+```yaml
+http:
+  increasedCardinality: true
+```
+
+Metrics generated:
+```
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/1",status="200"} 1
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/2",status="200"} 1
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/3",status="200"} 1
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/4",status="200"} 1
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders/5",status="200"} 1
+```
+
+For each request, a new metric is created with the request path. This process continues for every request made to a new order ID, resulting in unbounded cardinality since the IDs are ever-growing.
+
+
+### HTTP metrics exclude verbs
+
+The `excludeVerbs` option allows you to exclude the HTTP verb from the method label reported in the metrics. This can be useful in high-performance applications where memory savings are critical.
+
+### Examples of excluding HTTP verbs in metrics
+
+The following examples demonstrate how to exclude HTTP verbs in Dapr for managing HTTP metrics.
+
+#### Default - Include HTTP verbs
+
+Configuration:
+```yaml
+http:
+  excludeVerbs: false
+```
+
+Metrics generated:
+```
+dapr_http_server_request_count{app_id="order-service",method="GET",path="/orders",status="200"} 1
+dapr_http_server_request_count{app_id="order-service",method="POST",path="/orders",status="200"} 1
+```
+
+In this example, the HTTP method is included in the metrics, resulting in a separate metric for each HTTP method used on the `/orders` endpoint.
+
+#### Exclude HTTP verbs
+
+Configuration:
+```yaml
+http:
+  excludeVerbs: true
+```
+
+Metrics generated:
+```
+dapr_http_server_request_count{app_id="order-service",method="",path="/orders",status="200"} 2
+```
+
+In this example, the HTTP method is excluded from the metrics, resulting in a single metric for all requests to the `/orders` endpoint.
 
-The default value of `spec.metrics.http.increasedCardinality` is `true` in Dapr 1.13, to maintain the same behavior as Dapr 1.12 and older. However, the value will change to `false` (low-cardinality metrics by default) in Dapr 1.14.
 
-Setting `spec.metrics.http.increasedCardinality` to `false` is **recommended** to all Dapr users, to reduce resource consumption. The pre-1.13 behavior, which is used when the option is `true`, is considered legacy and is only maintained for users who have special requirements around backwards-compatibility.
 
 ## Transform metrics with regular expressions
 

diff --git a/daprdocs/content/en/reference/resource-specs/configuration-schema.md b/daprdocs/content/en/reference/resource-specs/configuration-schema.md
@@ -38,6 +38,10 @@ spec:
             regex: {}
     http:
       increasedCardinality: <TRUE-OR-FALSE>
+      pathMatching: 
+        - <PATH-A>
+        - <PATH-B>
+      excludeVerbs: <TRUE-OR-FALSE>
   httpPipeline: # for incoming http calls
     handlers:
       - name: <HANDLER-NAME>

diff --git a/daprdocs/package-lock.json b/daprdocs/package-lock.json
diff --git a/daprdocs/static/images/dapr-shared/daemonset.png b/daprdocs/static/images/dapr-shared/daemonset.png
diff --git a/daprdocs/static/images/dapr-shared/deployment.png b/daprdocs/static/images/dapr-shared/deployment.png
diff --git a/daprdocs/static/images/dapr-shared/sidecar.png b/daprdocs/static/images/dapr-shared/sidecar.png