Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Update AWS Route53 documentation to better explain the authentication options #1555

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
185 changes: 165 additions & 20 deletions content/docs/configuration/acme/dns01/route53.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,14 +12,15 @@ how cert-manager handles DNS01 challenges.
> (AWS) and that you already have a hosted zone in Route53.
>
> 📖 Read
> [Tutorial: Deploy cert-manager on Amazon Elastic Kubernetes (EKS) and use Let's Encrypt to sign a certificate for an HTTPS website](../../../tutorials/getting-started-aws-letsencrypt/README.md),
> which contains end-to-end instructions for those who are new to cert-manager and AWS.
> [Deploy cert-manager on AWS Elastic Kubernetes Service (EKS) and use Let's Encrypt to sign a certificate for an HTTPS website](../../../tutorials/getting-started-aws-letsencrypt/README.md),
> a tutorial which explains how to create an EKS cluster, install cert-manager, and configuring ACME ClusterIssuer using Route53 for DNS-01 challenges..

## Set up an IAM Role
## Set up an IAM Policy

cert-manager needs to be able to add records to Route53 in order to solve the
DNS01 challenge. To enable this, create a IAM policy with the following
permissions:
To solve a DNS01 challenge, cert-manager needs permission to add TXT records to your Route53 zone.
It needs permission to delete the TXT records, after the challenge succeeds or fails.
It also needs permission to list the hosted zones, unless you use the `hostedZoneID` field.
To enable this, create an IAM policy with the following permissions in the AWS account of the Route53 zone:

```json
{
Expand Down Expand Up @@ -47,30 +48,140 @@ permissions:
}
```

> Note: The `route53:ListHostedZonesByName` statement can be removed if you
> ℹ️ The `route53:ListHostedZonesByName` statement can be removed if you
> specify the (optional) `hostedZoneID`. You can further tighten the policy by
> limiting the hosted zone that cert-manager has access to (e.g.
> `arn:aws:route53:::hostedzone/DIKER8JEXAMPLE`).
>
> 📖 Read about [actions supported by Amazon Route 53](https://docs.aws.amazon.com/Route53/latest/APIReference/API_Operations_Amazon_Route_53.html),
> in the [Amazon Route 53 API Reference](https://docs.aws.amazon.com/Route53/latest/APIReference/Welcome.html).
>
> 📖 Learn how [`eksctl` can automatically create the cert-manager IAM policy](https://eksctl.io/usage/iam-policies/#cert-manager-policy), if you use EKS.

## Credentials

You have two options for the set up - either create a user or a role and attach
that policy from above. Using a role is considered best practice because you do
not have to store permanent credentials in a secret.
To solve a DNS01 challenge, cert-manager uses the Route53 API and
[requests to the Route53 API must be signed](https://docs.aws.amazon.com/Route53/latest/APIReference/requests-authentication.html)
using an access key ID and a secret access key.
There are several mechanisms for cert-manager to get the access key ID and secret access key,
some of which are now considered legacy.
The various mechanisms are described below in order of best practice.

### OIDC with dedicated Kubernetes ServiceAccount resources

In this scenario, cert-manager authenticates to Route53 using temporary credentials obtained from STS,
by exchanging a temporary Kubernetes ServiceAccount token for a temporary AWS access key.

| ✔️ Advantages | ✖️ Disadvantages |
|------------------------------------------|------------------------------------------------------------|
| Uses short-lived credentials | Additional Role and RoleBinding required |
| No key rotation necessary | Additional ServiceAccount required |
| No Kubernetes Secrets | Requires an AWS Identity Provider |
| **Works with any Kubernetes cluster** | Kubernetes cluster requires public OIDC discovery document |
| **Good isolation of Issuer permissions** | |

#### Mechanism

1. cert-manager connects to the Kubernetes API server and requests a [Kubernetes service account token](https://kubernetes.io/docs/concepts/security/service-accounts/#authenticating-credentials)
1. Kubernetes API server returns a signed JWT.
1. cert-manager connects to [AWS Security Token Service (STS)](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html)
invoking the [AssumeRoleWithWebIdentity Action](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html),
supplying the Kubernetes signed JWT and the Role ARN specified in the ClusterIssuer (or Issuer).
1. STS returns a [Temporary security credential](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
(a short lived AWS access key).
1. cert-manager connects to [AWS Route53](https://docs.aws.amazon.com/Route53/latest/APIReference/API_Operations.html)
invoking various actions to create TXT records to satisfy the ACME challenge.
It [signs the requests](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_aws-signing.html) using the short lived access key.

#### Configuration

TODO

### IAM Roles for Service Accounts (IRSA) with mounted ServiceAccount token

In this scenario, cert-manager authenticates to Route53 using temporary credentials obtained from STS,
by exchanging a temporary Kubernetes ServiceAccount token for a temporary AWS access key.
The Kubernetes ServiceAccount token is mounted into the Pod of the cert-manager controller.

In AWS this mechanism is branded as [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
IRSA comprises:
1. The [Amazon EKS Pod Identity Webhook](https://github.com/aws/amazon-eks-pod-identity-webhook)
which runs in the cluster,
watches for Pods that use a service account with the following annotation: `eks.amazonaws.com/role-arn`, and
adds a [ServiceAccount token volume projection](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#serviceaccount-token-volume-projection) to the Pod and adds an environment variable to the Pod to allow the AWS SDK to discover and load the token.
2. A client using an [AWS SDK with default credential loader](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html),
which is able to discover the projected token created by the webhook, and
automatically load it and send it to STS invoking the AssumeRoleWithWebIdentity action.

| ✔️ Advantages | ✖️ Disadvantages |
|----------------------------------|-----------------------------------------------------|
| Uses short-lived credentials | Requires an AWS Identity Provider |
| No key rotation necessary | **Poor isolation of Issuer permissions** |
| No Kubernetes Secrets | **(Issuers can assume the Roles of other Issuers)** |
| **Works best with EKS clusters** | (requires the EKS Pod Identity Webhook) |

#### Mechanism

1. Kubernetes API server regularly generates a signed JWT which is mounted into the cert-manager controller Pod,
at file path which is stored in the environment variable: `AWS_WEB_IDENTITY_TOKEN_FILE`.
1. cert-manager reads the signed JWT from the file system.
1. cert-manager connects to [AWS Security Token Service (STS)](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html)
invoking the [AssumeRoleWithWebIdentity Action](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html),
supplying the Kubernetes signed JWT and the Role ARN specified in the ClusterIssuer (or Issuer).
1. STS returns a [Temporary security credential](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
(a short lived AWS access key).
1. cert-manager connects to [AWS Route53](https://docs.aws.amazon.com/Route53/latest/APIReference/API_Operations.html)
invoking various actions to create TXT records to satisfy the ACME challenge.
It [signs the requests](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_aws-signing.html) using the short lived access key.


> 📖 Read [Configure the AWS Security Token Service endpoint for a service account](https://docs.aws.amazon.com/eks/latest/userguide/configure-sts-endpoint.html) if you need to change the

#### Configuration

TODO

### Load credentials from EC2 Instance Metadata Service (IMDS)

In this scenario, cert-manager authenticates to Route53 using temporary credentials [obtained from the EC2 Instance Metadata Service (IMDS)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-metadata-security-credentials.html).
The temporary credentials correspond to the role specified in the [instance profile](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html).

| ✔️ Advantages | ✖️ Disadvantages |
|----------------------------------|-----------------------------------------------------|
| Uses short-lived credentials | |
| No key rotation necessary | **Poor isolation of Issuer permissions** |
| No Kubernetes Secrets | **(Issuers can assume the Roles of other Issuers)** |
| **Minimal configuration needed** | **Works ONLY on EKS clusters** |

#### Mechanism

If you enable debug logging you will see the following series of requests to the EC2 metadata service at `Host: 169.254.169.254`:

1. `PUT /latest/api/token`:
Get an authentication token.
1. `GET /latest/meta-data/iam/security-credentials/`:
Discover the instance profile Role name.
1. `GET /latest/meta-data/iam/security-credentials/<ROLE_NAME>`:
Get temporary access key for the instance profile Role.

Then you will see a series of requests to Route53:
1. `GET /2013-04-01/hostedzonesbyname?dnsname=<DNS_NAME>`: Discover the Hosted Zone for the DNS name.
1. `POST /2013-04-01/hostedzone/<HOSTED_ZONE_ID>/rrset`: Create a change set with the ACME challenge TXT record.
1. `GET /2013-04-01/change/<CHANGE_ID`: Check the status of the change set.

> ℹ️ If you **also** specify an Issuer `role`, cert-manager will additionally connect to STS and invoke the `AssumeRole` action,
> to get temporary credentials for the assumed role.
> It will then use those temporary credentials to authenticate with Route53.
>
> 📖 Read [Cross Account Access](#cross-account-access) (below) to learn about the required IAM configuration.

cert-manager supports two ways of specifying credentials:
### Load User access key from dedicated Secret

- explicit by providing an `accessKeyID` or an `accessKeyIDSecretRef`, and a `secretAccessKeySecretRef`
- or implicit (using [metadata
service](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
or [environment variables or credentials
file](https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html#specifying-credentials).
TODO

cert-manager also supports specifying a `role` to enable cross-account access
and/or limit the access of cert-manager. Integration with
[`kiam`](https://github.com/uswitch/kiam) and
[`kube2iam`](https://github.com/jtblin/kube2iam) should work out of the box.
### Load User access key from mounted Secret

TODO

## Cross Account Access

Expand Down Expand Up @@ -386,3 +497,37 @@ spec:
serviceAccountRef:
name: <service-account-name> # The name of the service account created
```


### Troubleshooting

#### Why is the Issuer `region` field required?

The Issuer `region` field should not be required.
cert-manager uses [AWS SDK for Go V2](https://aws.github.io/aws-sdk-go-v2/),
which will discover the region from the environment when running on AWS EKS or EC2.

[Route53 API is a global service with a single global endpoint](https://docs.aws.amazon.com/general/latest/gr/rande.html#global-endpoints), so the region should not be necessary.
[When you use the AWS SDK to submit requests, you can leave the Region and endpoint unspecified](https://docs.aws.amazon.com/general/latest/gr/r53.html#r53_region).

This is a legacy of the original Route53 integration, which was developed at a time when cert-manager was expected to be running on non AWS managed clusters where neither the ambient `AWS_REGION` environment variable nor the EC2 instance metadata service were expected to be available.
The maintainer are aware of this inconsistency and the validation will be relaxed in a future version of cert-manager.

> 📖 Read [Route53: document use of "region" field](https://github.com/cert-manager/website/issues/56).

#### Turn on Debug Logging

If you set the cert-manager controller log level to `>=4` you will see AWS requests and responses among the log messages.

> ℹ️ Available since cert-manager >=1.16
>
> 📖 Read about [`ClientLogMode` in AWS SDK for Go V2](https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/logging/#clientlogmode).

#### Beware of Route53 quotas

[Amazon Route 53 API requests and entities are subject to the following quotas](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html)

#### Use `AWS_` environment variables to override defaults

If necessary you can set any of the [Environment variables supported by the AWS SDK](https://docs.aws.amazon.com/sdkref/latest/guide/settings-reference.html#EVarSettings).
You can set these using the [`extraEnv` value in the Helm chart](https://artifacthub.io/packages/helm/cert-manager/cert-manager/#extraenv-~-array).