diff --git a/.gitignore b/.gitignore
index aee6996..50e4b93 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,4 +6,9 @@ infra-*/terraform.tfstate
 infra-*/terraform.tfstate*
 infra-*/.terraform*
 infra-*/secrets.auto.tfvars
+*kubeconfig
+*terraform.tfstate*
+*terraform.lock.*
+.terraform
+*secrets.auto.tfvars
 my-notes
diff --git a/README.md b/README.md
index 08a528e..aacc81a 100644
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ In particular, this project aims to provide the following benefits to Open edX o
 ## Technology stack and architecture
 1. At the base is a Kubernetes cluster, which you must provide (e.g. using Terraform to provision Amazon EKS).
-   * Any cloud provider such as AWS or Digital Ocean should work. There is an example Terraform setup in `infra-example` but it is just a starting point and not recommended for production use.
+   * Any cloud provider such as AWS or DigitalOcean should work. There are example Terraform setups in the `infra-examples` folder, but they are just starting points and not recommended for production use.
 2. On top of that, this project's helm chart will install the shared resources you need - an ingress controller, monitoring, database clusters, etc. The following are included but can be disabled/replaced if you prefer an alternative:
    * Ingress controller: [ingress-nginx](https://kubernetes.github.io/ingress-nginx/)
    * Automatic HTTPS cert provisioning: [cert-manager](https://cert-manager.io/)
@@ -89,6 +89,75 @@ still present in your cluster.
 [pod-autoscaling plugin](https://github.com/eduNEXT/tutor-contrib-pod-autoscaling) enables the implementation of HPA and VPA to start scaling an installation workloads. Variables for the plugin configuration are documented there.
+#### Node autoscaling with Karpenter in EKS clusters
+
+This section explains how to install and configure [Karpenter](https://karpenter.sh/) in an EKS cluster, using the
+infrastructure examples included in this repository.
+
+> Prerequisites:
+  - An AWS account ID
+  - kubectl 1.27
+  - Terraform 1.5.x or higher
+  - Helm
+
+1. Clone this repository and navigate to `./infra-examples/aws`. You'll find Terraform modules for the `vpc` and `k8s-cluster`
+resources. Create the `vpc` resources first, followed by the `k8s-cluster` resources. Make sure to have the target
+AWS account ID available, and then run the following commands in each folder:
+
+   ```
+   terraform init
+   terraform plan
+   terraform apply -auto-approve
+   ```
+
+   This will create an EKS cluster in the new VPC, along with the Karpenter resources it requires.
+
+2. Once the `k8s-cluster` module has been applied, run `terraform output` in that module and copy the following output
+variables (a helper snippet is sketched after this list):
+
+   - cluster_name
+   - karpenter_irsa_role_arn
+   - karpenter_instance_profile_name
+
+   These variables will be required in the next steps.
+
+3. Karpenter is a dependency of the Harmony chart that can be enabled or disabled. To include Karpenter in the Harmony chart,
+**you must** configure the following variables in your `values.yaml` file:
+
+   - `karpenter.enabled`: true
+   - `karpenter.serviceAccount.annotations.eks\.amazonaws\.com/role-arn`: "<`karpenter_irsa_role_arn` value from module>"
+   - `karpenter.settings.aws.defaultInstanceProfile`: "<`karpenter_instance_profile_name` value from module>"
+   - `karpenter.settings.aws.clusterName`: "<`cluster_name` value from module>"
+
+   Below is an example of the Karpenter section in the `values.yaml` file:
+
+   ```yaml
+   karpenter:
+     enabled: true
+     serviceAccount:
+       annotations:
+         eks.amazonaws.com/role-arn: ""
+     settings:
+       aws:
+         # -- Cluster name.
+         clusterName: ""
+         # -- Default instance profile used for nodes launched by Karpenter.
+         defaultInstanceProfile: ""
+   ```
+
+4. Now, install the Harmony chart in the new EKS cluster using [these instructions](#usage-instructions); an example command
+is sketched below. This will provide a very basic Karpenter configuration with one
+[provisioner](https://karpenter.sh/docs/concepts/provisioners/) and one
+[node template](https://karpenter.sh/docs/concepts/node-templates/). Refer to the official documentation for further details.
+
+> **NOTE:**
+> This Karpenter installation does not currently support multiple provisioners or node templates.
+
+5. To test Karpenter, you can follow the instructions in the
+[official documentation](https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/#first-use); a minimal
+smoke test is also sketched at the end of this section.
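+As a convenience for step 2, the outputs can also be read individually with `terraform output -raw`. This is just a sketch,
+assuming the `infra-examples/aws/k8s-cluster` module from this repository:
+
+```bash
+cd infra-examples/aws/k8s-cluster
+# Print each value needed by the Karpenter section of values.yaml.
+terraform output -raw cluster_name
+terraform output -raw karpenter_irsa_role_arn
+terraform output -raw karpenter_instance_profile_name
+```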

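+For step 4, the installation will look roughly like the following. This is only a sketch: it assumes a local checkout of
+this repository and a release name and namespace of `harmony` (the same names used by the `helm uninstall` command later in
+this document); the canonical command is in the usage instructions linked above.
+
+```bash
+# Fetch chart dependencies (including the Karpenter subchart), then install or upgrade the release.
+helm dependency update charts/harmony-chart
+helm upgrade --install harmony charts/harmony-chart --namespace harmony --create-namespace --values values.yaml
+```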

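+Once the chart is installed, a quick way to confirm that Karpenter actually provisions capacity is to scale up a dummy
+workload and watch the controller logs. This sketch is adapted from the Karpenter getting-started guide; the deployment
+name `inflate` and the namespace Karpenter runs in are assumptions, so adjust them to your installation.
+
+```bash
+# Create a deployment whose replicas will not fit on the existing nodes.
+kubectl create deployment inflate --image=public.ecr.aws/eks-distro/kubernetes/pause:3.7
+kubectl set resources deployment inflate --requests=cpu=1
+kubectl scale deployment inflate --replicas=10
+
+# Follow the Karpenter controller logs while it launches a node
+# (use the namespace the Harmony chart installed Karpenter into, e.g. harmony).
+kubectl logs -f -n harmony -l app.kubernetes.io/name=karpenter -c controller
+
+# Scale back down; the empty node should be removed after ttlSecondsAfterEmpty.
+kubectl scale deployment inflate --replicas=0
+```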
@@ -238,18 +307,46 @@ Just run `helm uninstall --namespace harmony harmony` to uninstall this.
 ### How to create a cluster for testing on DigitalOcean
 If you use DigitalOcean, you can use Terraform to quickly spin up a cluster, try this out, then shut it down again.
-Here's how. First, put the following into `infra-tests/secrets.auto.tfvars` including a valid DigitalOcean access token:
+Here's how. First, put the following into `infra-examples/digitalocean/secrets.auto.tfvars`, including a valid DigitalOcean access token:
 ```
 cluster_name = "harmony-test"
 do_token = "digital-ocean-token"
 ```
 Then run:
 ```
-cd infra-example
+cd infra-examples/digitalocean
 terraform init
 terraform apply
 cd ..
-export KUBECONFIG=`pwd`/infra-example/kubeconfig
+export KUBECONFIG=`pwd`/infra-examples/digitalocean/kubeconfig
 ```
 Then follow steps 1-4 above. When you're done, run `terraform destroy` to clean up everything.
+
+## Appendix C: how to create a cluster for testing on AWS
+
+Similarly, if you use AWS, you can use Terraform to spin up a cluster, try this out, then shut it down again.
+Here's how. First, put the following into both `infra-examples/aws/vpc/secrets.auto.tfvars` and `infra-examples/aws/k8s-cluster/secrets.auto.tfvars`:
+
+  ```terraform
+  account_id = "012345678912"
+  aws_region = "us-east-1"
+  name       = "tutor-multi-test"
+  ```
+
+Then run:
+
+  ```bash
+  aws sts get-caller-identity  # to verify that awscli is properly configured
+  cd infra-examples/aws/vpc
+  terraform init
+  terraform apply              # run time is approximately 1 minute
+  cd ../k8s-cluster
+  terraform init
+  terraform apply              # run time is approximately 30 minutes
+
+  # to configure kubectl
+  aws eks --region us-east-1 update-kubeconfig --name tutor-multi-test --alias tutor-multi-test
+  ```
+
+Then follow steps 1-4 above. When you're done, run `terraform destroy` first in the `k8s-cluster` module and then in the `vpc` module to clean up everything.
diff --git a/charts/harmony-chart/Chart.lock b/charts/harmony-chart/Chart.lock
index 4bf4ac8..6be0d35 100644
--- a/charts/harmony-chart/Chart.lock
+++ b/charts/harmony-chart/Chart.lock
@@ -17,5 +17,8 @@ dependencies:
 - name: opensearch
   repository: https://opensearch-project.github.io/helm-charts
   version: 2.13.3
-digest: sha256:11b69b1ea771337b1e7cf8497ee342a25b095b86899b8cee716be8cc9f955559
-generated: "2023-07-01T19:23:29.18815+03:00"
+- name: karpenter
+  repository: oci://public.ecr.aws/karpenter
+  version: v0.29.2
+digest: sha256:453b9f734e2d770948d3cbd36529d98da284b96de051581ea8d11a3c05e7a78e
+generated: "2023-10-03T10:52:43.453442762-05:00"
diff --git a/charts/harmony-chart/Chart.yaml b/charts/harmony-chart/Chart.yaml
index 2d34fe0..e4fd749 100644
--- a/charts/harmony-chart/Chart.yaml
+++ b/charts/harmony-chart/Chart.yaml
@@ -5,7 +5,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes to the chart and its
 # templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.2.0
+version: 0.3.0
 # This is the version number of the application being deployed. This version number should be incremented each time you
 # make changes to the application. Versions are not expected to follow Semantic Versioning. They should reflect the
 # version the application is using. It is recommended to use it with quotes.
@@ -47,3 +47,8 @@ dependencies: version: "2.13.3" condition: opensearch.enabled repository: https://opensearch-project.github.io/helm-charts + +- name: karpenter + version: "v0.29.2" + repository: oci://public.ecr.aws/karpenter + condition: karpenter.enabled diff --git a/charts/harmony-chart/templates/karpenter/node-template.yaml b/charts/harmony-chart/templates/karpenter/node-template.yaml new file mode 100644 index 0000000..2fcb9bc --- /dev/null +++ b/charts/harmony-chart/templates/karpenter/node-template.yaml @@ -0,0 +1,15 @@ +{{- if .Values.karpenter.enabled -}} +apiVersion: karpenter.k8s.aws/v1alpha1 +kind: AWSNodeTemplate +metadata: + name: {{ .Values.karpenter.nodeTemplate.name }} + annotations: + "helm.sh/hook": post-install,post-upgrade +spec: + subnetSelector: + karpenter.sh/discovery: {{ .Values.karpenter.settings.aws.clusterName }} + securityGroupSelector: + karpenter.sh/discovery: {{ .Values.karpenter.settings.aws.clusterName }} + tags: + karpenter.sh/discovery: {{ .Values.karpenter.settings.aws.clusterName }} +{{- end }} diff --git a/charts/harmony-chart/templates/karpenter/provisioner.yaml b/charts/harmony-chart/templates/karpenter/provisioner.yaml new file mode 100644 index 0000000..47b8ee8 --- /dev/null +++ b/charts/harmony-chart/templates/karpenter/provisioner.yaml @@ -0,0 +1,23 @@ +{{- if .Values.karpenter.enabled -}} +apiVersion: karpenter.sh/v1alpha5 +kind: Provisioner +metadata: + name: {{ .Values.karpenter.provisioner.name }} + annotations: + "helm.sh/hook": post-install,post-upgrade +spec: + {{- if .Values.karpenter.provisioner.spec.requirements }} + requirements: {{ toYaml .Values.karpenter.provisioner.spec.requirements | nindent 4 }} + {{- end }} + {{- if .Values.karpenter.provisioner.spec.limits.resources }} + limits: + resources: + {{- range $key, $value := .Values.karpenter.provisioner.spec.limits.resources }} + {{ $key }}: {{ $value | quote }} + {{- end }} + {{- end }} + providerRef: + name: {{ .Values.karpenter.nodeTemplate.name }} + ttlSecondsUntilExpired: {{ .Values.karpenter.provisioner.spec.ttlSecondsUntilExpired }} + ttlSecondsAfterEmpty: {{ .Values.karpenter.provisioner.spec.ttlSecondsAfterEmpty }} +{{- end }} diff --git a/charts/harmony-chart/values.yaml b/charts/harmony-chart/values.yaml index bcf37f4..d8e1418 100644 --- a/charts/harmony-chart/values.yaml +++ b/charts/harmony-chart/values.yaml @@ -183,3 +183,56 @@ opensearch: ".opendistro-notebooks", ".opendistro-asynchronous-search-response*", ] + +karpenter: + # add Karpenter node management for AWS EKS clusters. See: https://karpenter.sh/ + enabled: false + serviceAccount: + name: "karpenter" + annotations: + eks.amazonaws.com/role-arn: "" + settings: + aws: + # -- Cluster name. + clusterName: "" + # -- Cluster endpoint. If not set, will be discovered during startup (EKS only) + # From version 0.25.0, Karpenter helm chart allows the discovery of the cluster endpoint. More details in + # https://github.com/aws/karpenter/blob/main/website/content/en/docs/upgrade-guide.md#upgrading-to-v0250 + # clusterEndpoint: "" + # -- The default instance profile name to use when launching nodes + defaultInstanceProfile: "" + # -- interruptionQueueName is disabled if not specified. Enabling interruption handling may + # require additional permissions on the controller service account. 
+      interruptionQueueName: ""
+  # ---------------------------------------------------------------------------
+  # Provide sensible defaults for resource provisioning and lifecycle
+  # ---------------------------------------------------------------------------
+  # Requirements for the provisioner API.
+  # More details in https://karpenter.sh/docs/concepts/provisioners/
+  provisioner:
+    name: "default"
+    spec:
+      requirements:
+        - key: karpenter.sh/capacity-type
+          operator: In
+          values: ["spot"]
+        # - key: node.kubernetes.io/instance-type
+        #   operator: In
+        #   values: ["t3.large", "t3.xlarge", "t3.2xlarge", "t2.xlarge", "t2.2xlarge"]
+        # - key: kubernetes.io/arch
+        #   operator: In
+        #   values: ["amd64"]
+      # The limits section controls the maximum amount of resources that the provisioner will manage.
+      # More details in https://karpenter.sh/docs/concepts/provisioners/#speclimitsresources
+      limits:
+        resources:
+          cpu: "200" # 50 nodes * 4 cpu
+          memory: "800Gi" # 50 nodes * 16Gi
+      # TTL in seconds. If nil, the feature is disabled and nodes will never terminate.
+      ttlSecondsUntilExpired: 2592000
+      # TTL in seconds. If nil, the feature is disabled and nodes will never scale down
+      # due to low utilization.
+      ttlSecondsAfterEmpty: 30
+  # Node template reference. More details in https://karpenter.sh/docs/concepts/node-templates/
+  nodeTemplate:
+    name: "default"
diff --git a/harmony-chart/charts/opensearch-2.11.4.tgz b/harmony-chart/charts/opensearch-2.11.4.tgz
deleted file mode 100644
index 6151fd5..0000000
Binary files a/harmony-chart/charts/opensearch-2.11.4.tgz and /dev/null differ
diff --git a/infra-examples/aws/README.md b/infra-examples/aws/README.md
new file mode 100644
index 0000000..18039fb
--- /dev/null
+++ b/infra-examples/aws/README.md
@@ -0,0 +1,33 @@
+# Reference Architecture for AWS
+
+This folder contains Terraform modules that create AWS reference resources preconfigured to support Open edX, as well as [Karpenter](https://karpenter.sh/) for management of [AWS EC2 spot-priced](https://aws.amazon.com/ec2/spot/) compute nodes and enhanced pod bin packing.
+
+## Virtual Private Cloud (VPC)
+
+There are no explicit requirements for Karpenter within this VPC definition. However, there *are* several requirements for EKS which might vary from the VPC module defaults now or in the future. These include:
+
+- defined sets of subnets for both private and public networks
+- a NAT gateway
+- enabling DNS host names
+- custom resource tags for public and private subnets
+- explicit assignments of AWS region and availability zones
+
+See additional details here: [AWS VPC README](./vpc/README.rst)
+
+## Elastic Kubernetes Service (EKS)
+
+AWS EKS has grown more complex over time. This reference implementation is preconfigured as necessary to ensure that (a) you and others on your team can access the Kubernetes cluster from both the AWS Console and kubectl, (b) it will work for an Open edX deployment, and (c) it will work with Karpenter.
+With these goals in mind, please note the following configuration details:
+
+- requirements detailed in the VPC section above are explicitly passed into this module as inputs
+- cluster endpoints for private and public access are enabled
+- IAM Roles for Service Accounts (IRSA) is enabled
+- Key Management Service (KMS) is enabled, encrypting all Kubernetes Secrets
+- cluster access via the aws-auth configMap is enabled
+- a karpenter.sh/discovery resource tag is added to the EKS cluster
+- various AWS EKS add-ons required by Open edX, Karpenter, and/or their supporting systems (metrics-server, vpa) are included
+- additional cluster node security configuration is added to allow node-to-node and pod-to-pod communication using internal DNS resolution
+- a managed node group is added, containing custom labels, IAM roles, and resource tags, all of which are required by Karpenter
+- additional resources required by the AWS EBS CSI Driver add-on, itself required by EKS since 1.22, are added
+- additional EC2 security groups are added to enable pod shell access from kubectl
+
+See additional details here: [AWS EKS README](./k8s-cluster/README.rst)
\ No newline at end of file
diff --git a/infra-examples/aws/k8s-cluster/README.rst b/infra-examples/aws/k8s-cluster/README.rst
new file mode 100644
index 0000000..04676d3
--- /dev/null
+++ b/infra-examples/aws/k8s-cluster/README.rst
@@ -0,0 +1,90 @@
+Amazon Elastic Kubernetes Service (EKS)
+=======================================
+
+Implements a `Kubernetes Cluster `_ via `AWS Elastic Kubernetes Service (EKS) `_. A Kubernetes cluster is a set of nodes that run containerized applications, grouped into pods and organized with namespaces. Containerizing an application means packaging that app with its dependencies and its required services into a single binary run-time file that can be downloaded directly from the Docker registry.
+Our Kubernetes cluster resides inside the VPC on a private subnet, meaning that it is generally not visible to the public. In order to receive traffic from the outside world, we implement `Kubernetes Ingress Controllers `_ which in turn implement a `Kubernetes Ingress `_
+for both an `AWS Classic Load Balancer `_ as well as our `Nginx proxy server `_.
+
+**NOTE:** THIS MODULE DEPENDS ON THE TERRAFORM MODULE 'vpc' contained in the parent folder of this module.
+
+Implementation Strategy
+-----------------------
+
+Our goal is to, as much as possible, implement a plain vanilla Kubernetes cluster, pre-configured to use Karpenter, that generally uses all default configuration values and that allows EC2 as well as Fargate compute nodes.
+
+This module uses the latest version of the community-supported `AWS EKS Terraform module `_ to create a fully configured Kubernetes cluster within the custom VPC.
+The AWS EKS Terraform module is widely supported and adopted, with more than 300 open source contributors and more than 21 million downloads from the Terraform registry as of March 2023.
+
+How it works
+------------
+
+Amazon Elastic Kubernetes Service (Amazon EKS) is a managed container service to run and scale Kubernetes applications in the cloud. It is a managed service, meaning that AWS is responsible for up-time and applies periodic system updates and security patches automatically.
+
+.. image:: doc/diagram-eks.png
+   :width: 100%
+   :alt: EKS Diagram
+
+
+AWS Fargate Serverless compute for containers
+---------------------------------------------
+
+AWS Fargate is a serverless, pay-as-you-go computing alternative to traditional EC2 instance-based computing nodes. It is compatible with both `Amazon Elastic Container Service (ECS) `_ and `Amazon Elastic Kubernetes Service (EKS) `_.
+There are two distinct benefits to using Fargate instead of EC2 instances. The first is cost. Similar to AWS Lambda, you only pay for the compute cycles that you consume. Most Open edX installations provision server infrastructure based on peak load estimates, even though peak loads only occur occasionally, during isolated events such as approaching homework due dates, mid-term exams, and so on. This in turn leads to EC2 instances being under-utilized most of the time.
+The second, related benefit is scaling. Fargate can absorb whatever workload you send to it, meaning that during peak usage periods of your Open edX platform you won't need to worry about provisioning additional EC2 server capacity.
+
+
+- **Running at scale**. Use Fargate with Amazon ECS or Amazon EKS to easily run and scale your containerized data processing workloads.
+- **Optimize costs**. With AWS Fargate there are no upfront expenses; you pay only for the resources used. Further optimize with `Compute Savings Plans `_ and `Fargate Spot `_, then use `Graviton2 `_ powered Fargate for up to 40% price performance improvements.
+- **Only pay for what you use**. Fargate scales the compute to closely match your specified resource requirements. With Fargate, there is no over-provisioning and no paying for additional servers.
+
+How to Manually Add More Kubernetes Admins
+------------------------------------------
+
+By default, your AWS IAM user account will be the only user who can view, interact with, and manage your new Kubernetes cluster. Other IAM users with admin permissions will still need to be explicitly added to the list of cluster admins.
+If you're new to Kubernetes then you'll find detailed technical how-to instructions in the AWS EKS documentation, `Enabling IAM user and role access to your cluster `_.
+You'll need kubectl in order to modify the aws-auth configMap in your Kubernetes cluster.
+
+**Note that this example enables both private and public access to the Kubernetes API endpoint (see the cluster endpoint settings in main.tf). If you disable public API access, kubectl will only be reachable from inside the AWS VPC on the private subnets (for example via a bastion host), and you'll need to configure the AWS CLI with an IAM key and secret that have the requisite admin permissions.**
+
+.. code-block:: bash
+
+   kubectl edit -n kube-system configmap/aws-auth
+
+Following is an example aws-auth configMap with additional IAM user accounts added to the admin "masters" group.
+
+.. code-block:: yaml
+
+   # Please edit the object below. Lines beginning with a '#' will be ignored,
+   # and an empty file will abort the edit. If an error occurs while saving this file will be
+   # reopened with the relevant failures.
+   #
+   apiVersion: v1
+   data:
+     mapRoles: |
+       - groups:
+         - system:bootstrappers
+         - system:nodes
+         rolearn: arn:aws:iam::012345678942:role/service-eks-node-group-20220518182244174100000002
+         username: system:node:{{EC2PrivateDNSName}}
+     mapUsers: |
+       - groups:
+         - system:masters
+         userarn: arn:aws:iam::012345678942:user/lawrence.mcdaniel
+         username: lawrence.mcdaniel
+       - groups:
+         - system:masters
+         userarn: arn:aws:iam::012345678942:user/ci
+         username: ci
+       - groups:
+         - system:masters
+         userarn: arn:aws:iam::012345678942:user/bob_marley
+         username: bob_marley
+   kind: ConfigMap
+   metadata:
+     creationTimestamp: "2022-05-18T18:38:29Z"
+     name: aws-auth
+     namespace: kube-system
+     resourceVersion: "499488"
+     uid: 52d6e7fd-01b7-4c80-b831-b971507e5228
diff --git a/infra-examples/aws/k8s-cluster/addon_ebs_csi_driver.tf b/infra-examples/aws/k8s-cluster/addon_ebs_csi_driver.tf
new file mode 100644
index 0000000..868acdd
--- /dev/null
+++ b/infra-examples/aws/k8s-cluster/addon_ebs_csi_driver.tf
@@ -0,0 +1,77 @@
+#------------------------------------------------------------------------------
+# written by: Lawrence McDaniel
+# https://lawrencemcdaniel.com/
+#
+# date: Dec-2022
+#
+# Create the Amazon EBS CSI driver IAM role for service accounts
+# https://docs.aws.amazon.com/eks/latest/userguide/csi-iam-role.html
+#
+# Note: in late December 2022 the AWS EKS EBS CSI Add-on suddenly began
+# inheriting its IAM role from the karpenter node group rather than using
+# the role that is explicitly created and assigned here. No idea why.
+# As a workaround, I'm also adding the AmazonEBSCSIDriverPolicy policy to the
+# karpenter node group, which is assigned inside the eks module in main.tf.
+#------------------------------------------------------------------------------
+resource "random_integer" "role_suffix" {
+  min = 10000
+  max = 99999
+}
+
+data "aws_iam_policy" "AmazonEBSCSIDriverPolicy" {
+  arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
+}
+
+# 2. Create the IAM role.
+resource "aws_iam_role" "AmazonEKS_EBS_CSI_DriverRole" {
+  name = "AmazonEKS_EBS_CSI_DriverRole-${random_integer.role_suffix.result}"
+  assume_role_policy = jsonencode({
+    "Version" : "2012-10-17",
+    "Statement" : [
+      {
+        "Effect" : "Allow",
+        "Principal" : {
+          "Federated" : "arn:aws:iam::${var.account_id}:oidc-provider/${module.eks.oidc_provider}"
+        },
+        "Action" : "sts:AssumeRoleWithWebIdentity",
+        "Condition" : {
+          "StringEquals" : {
+            "${module.eks.oidc_provider}:aud" : "sts.amazonaws.com",
+            "${module.eks.oidc_provider}:sub" : "system:serviceaccount:kube-system:ebs-csi-controller-sa"
+          }
+        }
+      }
+    ]
+  })
+  tags = local.tags
+}
+
+# 3. Attach the required AWS managed policy to the role
+resource "aws_iam_role_policy_attachment" "aws_ebs_csi_driver" {
+  role       = aws_iam_role.AmazonEKS_EBS_CSI_DriverRole.name
+  policy_arn = data.aws_iam_policy.AmazonEBSCSIDriverPolicy.arn
+}
+
+# 5. Annotate the ebs-csi-controller-sa Kubernetes service account with the ARN of the IAM role
+# 6. Restart the ebs-csi-controller deployment for the annotation to take effect
+resource "null_resource" "annotate-ebs-csi-controller" {
+  provisioner "local-exec" {
+    command = <<-EOT
+      # 1. configure kubeconfig locally with the credentials of the just-created
+      #    Kubernetes cluster.
+      # ---------------------------------------
+      aws eks --region ${var.aws_region} update-kubeconfig --name ${var.name} --alias ${var.name}
+      kubectl config use-context ${var.name}
+      kubectl config set-context --current --namespace=kube-system
+
+      # 2.
final install steps for EBS CSI Driver + # --------------------------------------- + kubectl annotate serviceaccount ebs-csi-controller-sa -n kube-system eks.amazonaws.com/role-arn=arn:aws:iam::${var.account_id}:role/${aws_iam_role.AmazonEKS_EBS_CSI_DriverRole.name} + kubectl rollout restart deployment ebs-csi-controller -n kube-system + EOT + } + + depends_on = [ + module.eks + ] +} diff --git a/infra-examples/aws/k8s-cluster/doc/aws-vpc-eks.png b/infra-examples/aws/k8s-cluster/doc/aws-vpc-eks.png new file mode 100644 index 0000000..74e1bf8 Binary files /dev/null and b/infra-examples/aws/k8s-cluster/doc/aws-vpc-eks.png differ diff --git a/infra-examples/aws/k8s-cluster/doc/diagram-eks.png b/infra-examples/aws/k8s-cluster/doc/diagram-eks.png new file mode 100644 index 0000000..0f79164 Binary files /dev/null and b/infra-examples/aws/k8s-cluster/doc/diagram-eks.png differ diff --git a/infra-examples/aws/k8s-cluster/doc/diagram-fargate.png b/infra-examples/aws/k8s-cluster/doc/diagram-fargate.png new file mode 100644 index 0000000..22e2d8a Binary files /dev/null and b/infra-examples/aws/k8s-cluster/doc/diagram-fargate.png differ diff --git a/infra-examples/aws/k8s-cluster/doc/node_group-diagram.jpeg b/infra-examples/aws/k8s-cluster/doc/node_group-diagram.jpeg new file mode 100644 index 0000000..9219b97 Binary files /dev/null and b/infra-examples/aws/k8s-cluster/doc/node_group-diagram.jpeg differ diff --git a/infra-examples/aws/k8s-cluster/doc/node_security_group_additional_rules.png b/infra-examples/aws/k8s-cluster/doc/node_security_group_additional_rules.png new file mode 100644 index 0000000..3bc93e5 Binary files /dev/null and b/infra-examples/aws/k8s-cluster/doc/node_security_group_additional_rules.png differ diff --git a/infra-examples/aws/k8s-cluster/main.tf b/infra-examples/aws/k8s-cluster/main.tf new file mode 100644 index 0000000..924db46 --- /dev/null +++ b/infra-examples/aws/k8s-cluster/main.tf @@ -0,0 +1,239 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com/ +# +# date: Mar-2022 +# +# usage: create an EKS cluster with one managed node group for EC2 +# plus a Fargate profile for serverless computing. +# +# Technical documentation: +# - https://docs.aws.amazon.com/kubernetes +# - https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/ +# +#------------------------------------------------------------------------------ + +locals { + # Used by Karpenter config to determine correct partition (i.e. - `aws`, `aws-gov`, `aws-cn`, etc.) + partition = data.aws_partition.current.partition + + tags = { + "Name" = var.name + "openedx-k8s-harmony/name" = var.name + "openedx-k8s-harmony/region" = var.aws_region + "openedx-k8s-harmony/terraform" = "true" + } + +} + +module "eks" { + source = "terraform-aws-modules/eks/aws" + version = "~> 19.13" + cluster_name = var.name + cluster_version = var.kubernetes_cluster_version + cluster_endpoint_private_access = true + cluster_endpoint_public_access = true + vpc_id = data.aws_vpc.reference.id + subnet_ids = data.aws_subnets.private.ids + create_cloudwatch_log_group = false + enable_irsa = true + + # NOTE: + # by default Kubernetes secrets are encrypted with this key. Add your IAM + # user ARN to the owner list in order to be able to view secrets. 
+ # AWS EKS KMS console: https://us-east-2.console.aws.amazon.com/kms/home + # + # audit your AWS EKS KMS key access by running: + # aws kms get-key-policy --key-id ADD-YOUR-KEY-ID-HERE --region us-east-2 --policy-name default --output text + create_kms_key = var.eks_create_kms_key + kms_key_owners = var.kms_key_owners + + # Add your IAM user ARN to aws_auth_users in order to gain access to the cluster itself. + # Note that alternatively, the cluster creator (presumably, you) can edit the manifest + # for kube-system/aws-auth configMap, adding additional users and roles as needed. + # see: + manage_aws_auth_configmap = true + aws_auth_users = var.map_users + + tags = merge( + local.tags, + # Tag node group resources for Karpenter auto-discovery + # NOTE - if creating multiple security groups with this module, only tag the + # security group that Karpenter should utilize with the following tag + { "karpenter.sh/discovery" = var.name } + ) + + # AWS EKS add-ons that are required in order to support persistent volume + # claims for ElasticSearch and Caddy (if you opt for this rather than nginx). + # Other addons are required by Karpenter and other optional supporting services. + # + # see: https://docs.aws.amazon.com/eks/latest/userguide/eks-add-ons.html + cluster_addons = { + # required to support internal networking between containers + vpc-cni = { + name = "vpc-cni" + } + # required to support internal DNS name resolution within the cluster + coredns = { + name = "coredns" + } + # required to maintain network rules on nodes and to enable internal + # network communication between pods. + kube-proxy = { + name = "kube-proxy" + } + # Required for release 1.22 and newer in order to support persistent volume + # claims for ElasticSearch and Caddy (if you opt for this rather than nginx). + aws-ebs-csi-driver = { + name = "aws-ebs-csi-driver" + service_account_role_arn = aws_iam_role.AmazonEKS_EBS_CSI_DriverRole.arn + } + } + + # to enable internal https network communication between nodes. + node_security_group_additional_rules = { + ingress_self_all = { + description = "openedx-k8s-harmony: Node to node all ports/protocols" + protocol = "-1" + from_port = 0 + to_port = 0 + type = "ingress" + cidr_blocks = [ + "172.16.0.0/12", + "192.168.0.0/16", + ] + } + port_8443 = { + description = "openedx-k8s-harmony: open port 8443 to vpc" + protocol = "-1" + from_port = 8443 + to_port = 8443 + type = "ingress" + source_node_security_group = true + } + egress_all = { + description = "openedx-k8s-harmony: Node all egress" + protocol = "-1" + from_port = 0 + to_port = 0 + type = "egress" + cidr_blocks = ["0.0.0.0/0"] + ipv6_cidr_blocks = ["::/0"] + } + } + + eks_managed_node_groups = { + # This node group is managed by Karpenter. There must be at least one + # node in this group at all times in order for Karpenter to monitor + # load and act on metrics data. Karpenter's bin packing algorithms + # perform more effectively with larger instance types. The default + # instance type is t3.large (2 vCPU / 8 GiB). These instances, + # beyond the 1 permanent instance, are assumed to be short-lived + # (a few hours or less) as these are usually only instantiated during + # bursts of user activity such as at the start of a scheduled lecture or + # exam on a large mooc. 
+ service = { + capacity_type = "ON_DEMAND" + enable_monitoring = false + desired_size = var.eks_service_group_desired_size + min_size = var.eks_service_group_min_size + max_size = var.eks_service_group_max_size + + # for node affinity + labels = { + node-group = "service" + } + + iam_role_additional_policies = { + # Required by Karpenter + AmazonSSMManagedInstanceCore = "arn:${local.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore" + + # Required by EBS CSI Add-on + AmazonEBSCSIDriverPolicy = data.aws_iam_policy.AmazonEBSCSIDriverPolicy.arn + } + + instance_types = ["${var.eks_service_group_instance_type}"] + tags = merge( + local.tags, + { Name = "eks-${var.shared_resource_identifier}" } + ) + } + + } +} + +#------------------------------------------------------------------------------ +# KARPENTER RESOURCES +#------------------------------------------------------------------------------ +# See more details in +# https://github.com/terraform-aws-modules/terraform-aws-eks/blob/v19.16.0/modules/karpenter/README.md#external-node-iam-role-default +module "karpenter" { + source = "terraform-aws-modules/eks/aws//modules/karpenter" + version = "~> 19.16" + + cluster_name = module.eks.cluster_name + + irsa_oidc_provider_arn = module.eks.oidc_provider_arn + irsa_namespace_service_accounts = ["karpenter:karpenter", "harmony:karpenter"] + + # Since Karpenter is running on an EKS Managed Node group, + # we can re-use the role that was created for the node group + create_iam_role = false + iam_role_arn = module.eks.eks_managed_node_groups["service"].iam_role_arn + + # Disable Spot termination + enable_spot_termination = false + + tags = local.tags +} + +#------------------------------------------------------------------------------ +# SUPPORTING RESOURCES +#------------------------------------------------------------------------------ + +# add an AWS IAM Role definition providing AWS console access to +# AWS EKS cluster instances. 
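+# Note: the manifest below only creates Kubernetes RBAC objects (a ClusterRole and
+# ClusterRoleBinding for the group 'eks-console-dashboard-full-access-group').
+# To actually view cluster workloads in the AWS console, that group still needs to
+# be mapped to your IAM user or role in the aws-auth ConfigMap (see the aws-auth
+# notes in this module's README and the AWS EKS console-access documentation).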
+resource "kubectl_manifest" "eks-console-full-access" { + yaml_body = templatefile("${path.module}/yml/eks-console-full-access.yaml", {}) +} + +# to enable shell access to nodes from kubectl +resource "aws_security_group" "worker_group_mgmt" { + name_prefix = "${var.name}-eks_hosting_group_mgmt" + description = "openedx-k8s-harmony: Ingress CLB worker group management" + vpc_id = data.aws_vpc.reference.id + + ingress { + description = "openedx-k8s-harmony: Ingress CLB" + from_port = 22 + to_port = 22 + protocol = "tcp" + + cidr_blocks = [ + "10.0.0.0/8", + ] + } + + tags = local.tags +} + +resource "aws_security_group" "all_worker_mgmt" { + name_prefix = "${var.name}-eks_all_worker_management" + description = "openedx-k8s-harmony: Ingress CLB worker management" + vpc_id = data.aws_vpc.reference.id + + ingress { + description = "openedx-k8s-harmony: Ingress CLB" + from_port = 22 + to_port = 22 + protocol = "tcp" + + cidr_blocks = [ + "10.0.0.0/8", + "172.16.0.0/12", + "192.168.0.0/16", + ] + } + + tags = local.tags +} diff --git a/infra-examples/aws/k8s-cluster/outputs.tf b/infra-examples/aws/k8s-cluster/outputs.tf new file mode 100644 index 0000000..0bfb8ae --- /dev/null +++ b/infra-examples/aws/k8s-cluster/outputs.tf @@ -0,0 +1,205 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com/ +# +# date: Mar-2022 +# +# usage: create an EKS cluster +#------------------------------------------------------------------------------ + +################################################################################ +# Cluster +################################################################################ + +output "cluster_arn" { + description = "The Amazon Resource Name (ARN) of the cluster" + value = module.eks.cluster_arn +} + +output "cluster_certificate_authority_data" { + description = "Base64 encoded certificate data required to communicate with the cluster" + value = module.eks.cluster_certificate_authority_data +} + +output "cluster_endpoint" { + description = "Endpoint for your Kubernetes API server" + value = module.eks.cluster_endpoint +} + +output "cluster_name" { + description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready" + value = module.eks.cluster_name +} + +output "cluster_oidc_issuer_url" { + description = "The URL on the EKS cluster for the OpenID Connect identity provider" + value = module.eks.cluster_oidc_issuer_url +} + +output "cluster_platform_version" { + description = "Platform version for the cluster" + value = module.eks.cluster_platform_version +} + +output "cluster_status" { + description = "Status of the EKS cluster. One of `CREATING`, `ACTIVE`, `DELETING`, `FAILED`" + value = module.eks.cluster_status +} + +output "cluster_primary_security_group_id" { + description = "Cluster security group that was created by Amazon EKS for the cluster. Managed node groups use this security group for control-plane-to-data-plane communication. 
Referred to as 'Cluster security group' in the EKS console" + value = module.eks.cluster_primary_security_group_id +} + +################################################################################ +# Karpenter +################################################################################ + +output "karpenter_irsa_role_arn" { + description = "IRSA role created by the Karpenter module" + value = module.karpenter.irsa_arn +} + +output "karpenter_instance_profile_name" { + description = "Instance profile created by the Karpenter module" + value = module.karpenter.instance_profile_name +} + +################################################################################ +# Security Group +################################################################################ + +output "cluster_security_group_arn" { + description = "Amazon Resource Name (ARN) of the cluster security group" + value = module.eks.cluster_security_group_arn +} + +output "cluster_security_group_id" { + description = "ID of the cluster security group" + value = module.eks.cluster_security_group_id +} + +################################################################################ +# Node Security Group +################################################################################ + +output "node_security_group_arn" { + description = "Amazon Resource Name (ARN) of the node shared security group" + value = module.eks.node_security_group_arn +} + +output "node_security_group_id" { + description = "ID of the node shared security group" + value = module.eks.node_security_group_id +} + +################################################################################ +# IRSA +################################################################################ + +output "oidc_provider" { + description = "The OpenID Connect identity provider (issuer URL without leading `https://`)" + value = module.eks.oidc_provider +} + +output "oidc_provider_arn" { + description = "The ARN of the OIDC Provider if `enable_irsa = true`" + value = module.eks.oidc_provider_arn +} + +################################################################################ +# IAM Role +################################################################################ + +output "cluster_iam_role_name" { + description = "IAM role name of the EKS cluster" + value = module.eks.cluster_iam_role_name +} + +output "cluster_iam_role_arn" { + description = "IAM role ARN of the EKS cluster" + value = module.eks.cluster_iam_role_arn +} + +output "cluster_iam_role_unique_id" { + description = "Stable and unique string identifying the IAM role" + value = module.eks.cluster_iam_role_unique_id +} + +################################################################################ +# EKS Addons +################################################################################ + +output "cluster_addons" { + description = "Map of attribute maps for all EKS cluster addons enabled" + value = module.eks.cluster_addons +} + +################################################################################ +# EKS Identity Provider +################################################################################ + +output "cluster_identity_providers" { + description = "Map of attribute maps for all EKS identity providers enabled" + value = module.eks.cluster_identity_providers +} + +################################################################################ +# CloudWatch Log Group +################################################################################ + +output 
"cloudwatch_log_group_name" { + description = "Name of cloudwatch log group created" + value = module.eks.cloudwatch_log_group_name +} + +output "cloudwatch_log_group_arn" { + description = "Arn of cloudwatch log group created" + value = module.eks.cloudwatch_log_group_arn +} + +################################################################################ +# Fargate Profile +################################################################################ + +output "fargate_profiles" { + description = "Map of attribute maps for all EKS Fargate Profiles created" + value = module.eks.fargate_profiles +} + +################################################################################ +# EKS Managed Node Groups +################################################################################ +output "service_node_group_iam_role_name" { + value = module.eks.eks_managed_node_groups["service"].iam_role_name +} + +output "service_node_group_iam_role_arn" { + value = module.eks.eks_managed_node_groups["service"].iam_role_arn +} +output "eks_managed_node_groups" { + description = "Map of attribute maps for all EKS managed node groups created" + value = module.eks.eks_managed_node_groups +} + +################################################################################ +# Self Managed Node Group +################################################################################ + +output "self_managed_node_groups" { + description = "Map of attribute maps for all self managed node groups created" + value = module.eks.self_managed_node_groups +} + +################################################################################ +# Additional +################################################################################ + +output "aws_auth_configmap_yaml" { + description = "Formatted yaml output for base aws-auth configmap containing roles used in cluster node groups/fargate profiles" + value = module.eks.aws_auth_configmap_yaml +} + +################################################################################ +# ELB +################################################################################ diff --git a/infra-examples/aws/k8s-cluster/providers.tf b/infra-examples/aws/k8s-cluster/providers.tf new file mode 100644 index 0000000..23a1770 --- /dev/null +++ b/infra-examples/aws/k8s-cluster/providers.tf @@ -0,0 +1,58 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com/ +# +# date: Aug-2022 +# +# usage: all providers for Kubernetes and its sub-systems. The general strategy +# is to manage authentications via aws cli where possible, simply to limit +# the environment requirements in order to get this module to work. +# +# another alternative for each of the providers would be to rely on +# the local kubeconfig file. 
+#------------------------------------------------------------------------------ + +# Required by Karpenter +data "aws_partition" "current" {} + +# Configure the AWS Provider +provider "aws" { + region = var.aws_region +} + +provider "kubernetes" { + host = module.eks.cluster_endpoint + cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) + + exec { + api_version = "client.authentication.k8s.io/v1beta1" + command = "aws" + args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name] + } +} + +# Required by Karpenter and metrics-server +provider "kubectl" { + host = module.eks.cluster_endpoint + cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) + + exec { + api_version = "client.authentication.k8s.io/v1beta1" + command = "aws" + args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name] + } +} + +# Required by Karpenter and metrics-server +provider "helm" { + kubernetes { + host = module.eks.cluster_endpoint + cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) + + exec { + api_version = "client.authentication.k8s.io/v1beta1" + command = "aws" + args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name] + } + } +} diff --git a/infra-examples/aws/k8s-cluster/variables.tf b/infra-examples/aws/k8s-cluster/variables.tf new file mode 100644 index 0000000..2febbf8 --- /dev/null +++ b/infra-examples/aws/k8s-cluster/variables.tf @@ -0,0 +1,118 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com/ +# +# date: Mar-2022 +# +# usage: create an EKS cluster +#------------------------------------------------------------------------------ +variable "account_id" { + description = "a 12-digit AWS account id, all integers. example: 012345678999" + type = string +} + +variable "shared_resource_identifier" { + description = "a prefix to add to all resource names associated with this Kubernetes cluster instance" + type = string + default = "" +} + +variable "name" { + description = "a valid Kubernetes name definition" + type = string + default = "openedx-k8s-harmony" +} + +variable "aws_region" { + description = "the AWS region code. example: us-east-1. see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html" + type = string + default = "us-east-1" +} + + +variable "enable_irsa" { + description = "true to create an OpenID Connect Provider for EKS to enable IRSA (IAM Roles for Service Accounts)." + type = bool + default = true +} + +variable "kubernetes_cluster_version" { + description = "the Kubernetes release for this cluster" + type = string + default = "1.27" +} + +variable "eks_create_kms_key" { + description = "true to create an AWS Key Management Service (KMS) key for encryption of all Kubernetes secrets in this cluster." 
+ type = bool + default = true +} + +variable "eks_service_group_instance_type" { + description = "AWS EC2 instance type to deploy into the 'service' AWS EKS Managed Node Group" + type = string + default = "t3.large" +} + +variable "eks_service_group_min_size" { + description = "The minimum number of AWS EC2 instance nodes to run in the 'service' AWS EKS Managed Node Group" + type = number + default = 3 +} + +variable "eks_service_group_max_size" { + description = "The maximum number of AWS EC2 instance nodes to run in the 'service' AWS EKS Managed Node Group" + type = number + default = 3 +} + +variable "eks_service_group_desired_size" { + description = "Only read during cluster creation. The desired number of AWS EC2 instance nodes to run in the 'service' AWS EKS Managed Node Group" + type = number + default = 3 +} + +# sample data: +# ----------------------------------------------------------------------------- +# map_users = [ +# { +# userarn = "arn:aws:iam::012345678999:user/mcdaniel" +# username = "mcdaniel" +# groups = ["system:masters"] +# }, +# { +# userarn = "arn:aws:iam::012345678999:user/bob_marley" +# username = "bob_marley" +# groups = ["system:masters"] +# }, +#] +variable "map_users" { + description = "Additional IAM users to add to the aws-auth configmap." + type = list(object({ + userarn = string + username = string + groups = list(string) + })) + default = [] +} + +variable "map_roles" { + description = "Additional IAM roles to add to the aws-auth configmap." + type = list(object({ + userarn = string + username = string + groups = list(string) + })) + default = [] +} + +# sample data: +# ----------------------------------------------------------------------------- +# kms_key_owners = [ +# "arn:aws:iam::012345678999:user/mcdaniel", +# "arn:aws:iam::012345678999:user/bob_marley", +# ] +variable "kms_key_owners" { + type = list(any) + default = [] +} diff --git a/infra-examples/aws/k8s-cluster/versions.tf b/infra-examples/aws/k8s-cluster/versions.tf new file mode 100644 index 0000000..7b4ae0a --- /dev/null +++ b/infra-examples/aws/k8s-cluster/versions.tf @@ -0,0 +1,38 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com/ +# +# date: Mar-2022 +# +# usage: create an EKS cluster +#------------------------------------------------------------------------------ +terraform { + required_version = "~> 1.3" + + required_providers { + local = { + source = "hashicorp/local" + version = "~> 2.4" + } + random = { + source = "hashicorp/random" + version = "~> 3.5" + } + aws = { + source = "hashicorp/aws" + version = "~> 4.65" + } + kubectl = { + source = "gavinbunney/kubectl" + version = "~> 1.14" + } + helm = { + source = "hashicorp/helm" + version = "~> 2.9" + } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.20" + } + } +} diff --git a/infra-examples/aws/k8s-cluster/vpc.tf b/infra-examples/aws/k8s-cluster/vpc.tf new file mode 100644 index 0000000..13ea2e9 --- /dev/null +++ b/infra-examples/aws/k8s-cluster/vpc.tf @@ -0,0 +1,19 @@ +data "aws_vpc" "reference" { + filter { + name = "tag:Name" + values = [var.name] + } +} + +data "aws_subnets" "private" { + + filter { + name = "vpc-id" + values = [data.aws_vpc.reference.id] + } + + filter { + name = "tag:Type" + values = ["private"] + } +} diff --git a/infra-examples/aws/k8s-cluster/yml/eks-console-full-access.yaml b/infra-examples/aws/k8s-cluster/yml/eks-console-full-access.yaml new file mode 100644 index 0000000..0f9e5a6 
--- /dev/null +++ b/infra-examples/aws/k8s-cluster/yml/eks-console-full-access.yaml @@ -0,0 +1,44 @@ +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: eks-console-dashboard-full-access-clusterrole +rules: + - apiGroups: + - "" + resources: + - nodes + - namespaces + - pods + verbs: + - get + - list + - apiGroups: + - apps + resources: + - deployments + - daemonsets + - statefulsets + - replicasets + verbs: + - get + - list + - apiGroups: + - batch + resources: + - jobs + verbs: + - get + - list +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: eks-console-dashboard-full-access-binding +subjects: + - kind: Group + name: eks-console-dashboard-full-access-group + apiGroup: rbac.authorization.k8s.io +roleRef: + kind: ClusterRole + name: eks-console-dashboard-full-access-clusterrole + apiGroup: rbac.authorization.k8s.io diff --git a/infra-examples/aws/vpc/README.rst b/infra-examples/aws/vpc/README.rst new file mode 100644 index 0000000..a5ef222 --- /dev/null +++ b/infra-examples/aws/vpc/README.rst @@ -0,0 +1,20 @@ +Reference Infrastructure for AWS Virtual Private Cloud (VPC) +============================================================ + +Implements an `AWS Virtual Private Cloud `_ this is preconfigured to support an AWS Elastic Kubernetes Cluster. Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you've defined. This virtual network closely resembles a traditional network that you'd operate in your own data center, with the benefits of using the scalable infrastructure of AWS. + +Implementation Strategy +----------------------- + +Our goal is to, as much as possible, implement a plain vanilla VPC that pre-configured as necessary to support an AWS Elastic Kubernetes Service instance. It generally uses all default configuration values. + +This module uses the latest version of the community-supported `AWS VPC Terraform module `_ to create a fully configured Virtual Private Cloud within your AWS account. +AWS VPC Terraform module is widely supported and adopted, with more than 100 open source code contributers, and more than 37 million downloads from the Terraform registry as of March, 2023. + +What it creates +~~~~~~~~~~~~~~~ + +.. image:: doc/aws-vpc-eks.png + :width: 100% + :alt: Virtual Private Cloud Diagram + diff --git a/infra-examples/aws/vpc/doc/aws-vpc-eks.png b/infra-examples/aws/vpc/doc/aws-vpc-eks.png new file mode 100644 index 0000000..74e1bf8 Binary files /dev/null and b/infra-examples/aws/vpc/doc/aws-vpc-eks.png differ diff --git a/infra-examples/aws/vpc/main.tf b/infra-examples/aws/vpc/main.tf new file mode 100644 index 0000000..77638d1 --- /dev/null +++ b/infra-examples/aws/vpc/main.tf @@ -0,0 +1,67 @@ +#------------------------------------------------------------------------------ +# written by: Lawrence McDaniel +# https://lawrencemcdaniel.com +# +# date: mar-2022 +# +# usage: create a VPC to contain all Open edX backend resources. +# this VPC is configured to generally use all AWS defaults. +# Thus, you should get the same configuration here that you'd +# get by creating a new VPC from the AWS Console. +# +# There are a LOT of options in this module. 
+# see https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest +#------------------------------------------------------------------------------ +locals { + azs = ["${var.aws_region}a", "${var.aws_region}b", "${var.aws_region}c"] + + # a bit of foreshadowing: + # AWS EKS uses tags for identifying resources which it interacts. + # here we are tagging the public and private subnets with specially-named tags + # that EKS uses to know where its public and internal load balancers should be placed. + # + # these tags are required, regardless of whether we're using EKS with EC2 worker nodes + # or with a Fargate Compute Cluster. + public_subnet_tags = { + "Type" = "public" + "kubernetes.io/cluster/${var.name}" = "shared" + "kubernetes.io/role/elb" = "1" + } + + private_subnet_tags = { + "Type" = "private" + "kubernetes.io/cluster/${var.name}" = "shared" + "kubernetes.io/role/internal-elb" = "1" + "karpenter.sh/discovery" = var.name + } + + tags = { + "Name" = var.name + "openedx-k8s-harmony/name" = var.name + "openedx-k8s-harmony/region" = var.aws_region + "openedx-k8s-harmony/terraform" = "true" + } + + +} + +module "vpc" { + source = "terraform-aws-modules/vpc/aws" + version = "~> 4.0" + create_vpc = true + azs = local.azs + public_subnet_tags = local.public_subnet_tags + private_subnet_tags = local.private_subnet_tags + tags = local.tags + name = var.name + cidr = var.cidr + public_subnets = var.public_subnets + private_subnets = var.private_subnets + database_subnets = var.database_subnets + elasticache_subnets = var.elasticache_subnets + enable_ipv6 = var.enable_ipv6 + enable_dns_hostnames = var.enable_dns_hostnames + enable_nat_gateway = var.enable_nat_gateway + single_nat_gateway = var.single_nat_gateway + one_nat_gateway_per_az = var.one_nat_gateway_per_az +} diff --git a/infra-examples/aws/vpc/outputs.tf b/infra-examples/aws/vpc/outputs.tf new file mode 100644 index 0000000..1bc6ef7 --- /dev/null +++ b/infra-examples/aws/vpc/outputs.tf @@ -0,0 +1,550 @@ +#------------------------------------------------------------------------------ +# written by: Miguel Afonso +# https://www.linkedin.com/in/mmafonso/ +# +# date: Aug-2021 +# +# usage: create a VPC to contain all Open edX backend resources. 
+#------------------------------------------------------------------------------ + +output "vpc_id" { + description = "The ID of the VPC" + value = module.vpc.vpc_id +} + +output "vpc_arn" { + description = "The ARN of the VPC" + value = module.vpc.vpc_arn +} + +output "vpc_cidr_block" { + description = "The CIDR block of the VPC" + value = module.vpc.vpc_cidr_block +} + +output "default_security_group_id" { + description = "The ID of the security group created by default on VPC creation" + value = module.vpc.default_security_group_id +} + +output "default_network_acl_id" { + description = "The ID of the default network ACL" + value = module.vpc.default_network_acl_id +} + +output "default_route_table_id" { + description = "The ID of the default route table" + value = module.vpc.default_route_table_id +} + +output "vpc_instance_tenancy" { + description = "Tenancy of instances spin up within VPC" + value = module.vpc.vpc_instance_tenancy +} + +output "vpc_enable_dns_support" { + description = "Whether or not the VPC has DNS support" + value = module.vpc.vpc_enable_dns_support +} + +output "vpc_enable_dns_hostnames" { + description = "Whether or not the VPC has DNS hostname support" + value = module.vpc.vpc_enable_dns_hostnames +} + +output "vpc_main_route_table_id" { + description = "The ID of the main route table associated with this VPC" + value = module.vpc.vpc_main_route_table_id +} + +output "vpc_ipv6_association_id" { + description = "The association ID for the IPv6 CIDR block" + value = module.vpc.vpc_ipv6_association_id +} + +output "vpc_ipv6_cidr_block" { + description = "The IPv6 CIDR block" + value = module.vpc.vpc_ipv6_cidr_block +} + +output "vpc_secondary_cidr_blocks" { + description = "List of secondary CIDR blocks of the VPC" + value = module.vpc.vpc_secondary_cidr_blocks +} + +output "vpc_owner_id" { + description = "The ID of the AWS account that owns the VPC" + value = module.vpc.vpc_owner_id +} + +output "private_subnets" { + description = "List of IDs of private subnets" + value = module.vpc.private_subnets +} + +output "private_subnet_arns" { + description = "List of ARNs of private subnets" + value = module.vpc.private_subnet_arns +} + +output "private_subnets_cidr_blocks" { + description = "List of cidr_blocks of private subnets" + value = module.vpc.private_subnets_cidr_blocks +} + +output "private_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of private subnets in an IPv6 enabled VPC" + value = module.vpc.private_subnets_ipv6_cidr_blocks +} + +output "public_subnets" { + description = "List of IDs of public subnets" + value = module.vpc.public_subnets +} + +output "public_subnet_arns" { + description = "List of ARNs of public subnets" + value = module.vpc.public_subnet_arns +} + +output "public_subnets_cidr_blocks" { + description = "List of cidr_blocks of public subnets" + value = module.vpc.public_subnets_cidr_blocks +} + +output "public_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of public subnets in an IPv6 enabled VPC" + value = module.vpc.public_subnets_ipv6_cidr_blocks +} + +output "outpost_subnets" { + description = "List of IDs of outpost subnets" + value = module.vpc.outpost_subnets +} + +output "outpost_subnet_arns" { + description = "List of ARNs of outpost subnets" + value = module.vpc.outpost_subnet_arns +} + +output "outpost_subnets_cidr_blocks" { + description = "List of cidr_blocks of outpost subnets" + value = module.vpc.outpost_subnets_cidr_blocks +} + +output "outpost_subnets_ipv6_cidr_blocks" { 
+ description = "List of IPv6 cidr_blocks of outpost subnets in an IPv6 enabled VPC" + value = module.vpc.outpost_subnets_ipv6_cidr_blocks +} + +output "database_subnets" { + description = "List of IDs of database subnets" + value = module.vpc.database_subnets +} + +output "database_subnet_arns" { + description = "List of ARNs of database subnets" + value = module.vpc.database_subnet_arns +} + +output "database_subnets_cidr_blocks" { + description = "List of cidr_blocks of database subnets" + value = module.vpc.database_subnets_cidr_blocks +} + +output "database_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of database subnets in an IPv6 enabled VPC" + value = module.vpc.database_subnets_ipv6_cidr_blocks +} + +output "database_subnet_group" { + description = "ID of database subnet group" + value = module.vpc.database_subnet_group +} + +output "database_subnet_group_name" { + description = "Name of database subnet group" + value = module.vpc.database_subnet_group_name +} + +output "redshift_subnets" { + description = "List of IDs of redshift subnets" + value = module.vpc.redshift_subnets +} + +output "redshift_subnet_arns" { + description = "List of ARNs of redshift subnets" + value = module.vpc.redshift_subnet_arns +} + +output "redshift_subnets_cidr_blocks" { + description = "List of cidr_blocks of redshift subnets" + value = module.vpc.redshift_subnets_cidr_blocks +} + +output "redshift_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of redshift subnets in an IPv6 enabled VPC" + value = module.vpc.redshift_subnets_ipv6_cidr_blocks +} + +output "redshift_subnet_group" { + description = "ID of redshift subnet group" + value = module.vpc.redshift_subnet_group +} + +output "elasticache_subnets" { + description = "List of IDs of elasticache subnets" + value = module.vpc.elasticache_subnets +} + +output "elasticache_subnet_arns" { + description = "List of ARNs of elasticache subnets" + value = module.vpc.elasticache_subnet_arns +} + +output "elasticache_subnets_cidr_blocks" { + description = "List of cidr_blocks of elasticache subnets" + value = module.vpc.elasticache_subnets_cidr_blocks +} + +output "elasticache_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of elasticache subnets in an IPv6 enabled VPC" + value = module.vpc.elasticache_subnets_ipv6_cidr_blocks +} + +output "intra_subnets" { + description = "List of IDs of intra subnets" + value = module.vpc.intra_subnets +} + +output "intra_subnet_arns" { + description = "List of ARNs of intra subnets" + value = module.vpc.intra_subnet_arns +} + +output "intra_subnets_cidr_blocks" { + description = "List of cidr_blocks of intra subnets" + value = module.vpc.intra_subnets_cidr_blocks +} + +output "intra_subnets_ipv6_cidr_blocks" { + description = "List of IPv6 cidr_blocks of intra subnets in an IPv6 enabled VPC" + value = module.vpc.intra_subnets_ipv6_cidr_blocks +} + +output "elasticache_subnet_group" { + description = "ID of elasticache subnet group" + value = module.vpc.elasticache_subnet_group +} + +output "elasticache_subnet_group_name" { + description = "Name of elasticache subnet group" + value = module.vpc.elasticache_subnet_group_name +} + +output "public_route_table_ids" { + description = "List of IDs of public route tables" + value = module.vpc.public_route_table_ids +} + +output "private_route_table_ids" { + description = "List of IDs of private route tables" + value = module.vpc.private_route_table_ids +} + +output "database_route_table_ids" { + description = 
"List of IDs of database route tables" + value = module.vpc.database_route_table_ids +} + +output "redshift_route_table_ids" { + description = "List of IDs of redshift route tables" + value = module.vpc.redshift_route_table_ids +} + +output "elasticache_route_table_ids" { + description = "List of IDs of elasticache route tables" + value = module.vpc.elasticache_route_table_ids +} + +output "intra_route_table_ids" { + description = "List of IDs of intra route tables" + value = module.vpc.intra_route_table_ids +} + +output "public_internet_gateway_route_id" { + description = "ID of the internet gateway route" + value = module.vpc.public_internet_gateway_route_id +} + +output "public_internet_gateway_ipv6_route_id" { + description = "ID of the IPv6 internet gateway route" + value = module.vpc.public_internet_gateway_ipv6_route_id +} + +output "database_internet_gateway_route_id" { + description = "ID of the database internet gateway route" + value = module.vpc.database_internet_gateway_route_id +} + +output "database_nat_gateway_route_ids" { + description = "List of IDs of the database nat gateway route" + value = module.vpc.database_nat_gateway_route_ids +} + +output "database_ipv6_egress_route_id" { + description = "ID of the database IPv6 egress route" + value = module.vpc.database_ipv6_egress_route_id +} + +output "private_nat_gateway_route_ids" { + description = "List of IDs of the private nat gateway route" + value = module.vpc.private_nat_gateway_route_ids +} + +output "private_ipv6_egress_route_ids" { + description = "List of IDs of the ipv6 egress route" + value = module.vpc.private_ipv6_egress_route_ids +} + +output "private_route_table_association_ids" { + description = "List of IDs of the private route table association" + value = module.vpc.private_route_table_association_ids +} + +output "database_route_table_association_ids" { + description = "List of IDs of the database route table association" + value = module.vpc.database_route_table_association_ids +} + +output "redshift_route_table_association_ids" { + description = "List of IDs of the redshift route table association" + value = module.vpc.redshift_route_table_association_ids +} + +output "redshift_public_route_table_association_ids" { + description = "List of IDs of the public redshidt route table association" + value = module.vpc.redshift_public_route_table_association_ids +} + +output "elasticache_route_table_association_ids" { + description = "List of IDs of the elasticache route table association" + value = module.vpc.elasticache_route_table_association_ids +} + +output "intra_route_table_association_ids" { + description = "List of IDs of the intra route table association" + value = module.vpc.intra_route_table_association_ids +} + +output "public_route_table_association_ids" { + description = "List of IDs of the public route table association" + value = module.vpc.public_route_table_association_ids +} + +output "dhcp_options_id" { + description = "The ID of the DHCP options" + value = module.vpc.dhcp_options_id +} + +output "nat_ids" { + description = "List of allocation ID of Elastic IPs created for AWS NAT Gateway" + value = module.vpc.nat_ids +} + +output "nat_public_ips" { + description = "List of public Elastic IPs created for AWS NAT Gateway" + value = module.vpc.nat_public_ips +} + +output "natgw_ids" { + description = "List of NAT Gateway IDs" + value = module.vpc.natgw_ids +} + +output "igw_id" { + description = "The ID of the Internet Gateway" + value = module.vpc.igw_id +} + +output "igw_arn" { + 
description = "The ARN of the Internet Gateway" + value = module.vpc.igw_arn +} + +output "egress_only_internet_gateway_id" { + description = "The ID of the egress only Internet Gateway" + value = module.vpc.egress_only_internet_gateway_id +} + +output "cgw_ids" { + description = "List of IDs of Customer Gateway" + value = module.vpc.cgw_ids +} + +output "cgw_arns" { + description = "List of ARNs of Customer Gateway" + value = module.vpc.cgw_arns +} + +output "this_customer_gateway" { + description = "Map of Customer Gateway attributes" + value = module.vpc.this_customer_gateway +} + +output "vgw_id" { + description = "The ID of the VPN Gateway" + value = module.vpc.vgw_id +} + +output "vgw_arn" { + description = "The ARN of the VPN Gateway" + value = module.vpc.vgw_arn +} + +output "default_vpc_id" { + description = "The ID of the Default VPC" + value = module.vpc.default_vpc_id +} + +output "default_vpc_arn" { + description = "The ARN of the Default VPC" + value = module.vpc.default_vpc_arn +} + +output "default_vpc_cidr_block" { + description = "The CIDR block of the Default VPC" + value = module.vpc.default_vpc_cidr_block +} + +output "default_vpc_default_security_group_id" { + description = "The ID of the security group created by default on Default VPC creation" + value = module.vpc.default_vpc_default_security_group_id +} + +output "default_vpc_default_network_acl_id" { + description = "The ID of the default network ACL of the Default VPC" + value = module.vpc.default_vpc_default_network_acl_id +} + +output "default_vpc_default_route_table_id" { + description = "The ID of the default route table of the Default VPC" + value = module.vpc.default_vpc_default_route_table_id +} + +output "default_vpc_instance_tenancy" { + description = "Tenancy of instances spin up within Default VPC" + value = module.vpc.default_vpc_instance_tenancy +} + +output "default_vpc_enable_dns_support" { + description = "Whether or not the Default VPC has DNS support" + value = module.vpc.default_vpc_enable_dns_support +} + +output "default_vpc_enable_dns_hostnames" { + description = "Whether or not the Default VPC has DNS hostname support" + value = module.vpc.default_vpc_enable_dns_hostnames +} + +output "default_vpc_main_route_table_id" { + description = "The ID of the main route table associated with the Default VPC" + value = module.vpc.default_vpc_main_route_table_id +} + +output "public_network_acl_id" { + description = "ID of the public network ACL" + value = module.vpc.public_network_acl_id +} + +output "public_network_acl_arn" { + description = "ARN of the public network ACL" + value = module.vpc.public_network_acl_arn +} + +output "private_network_acl_id" { + description = "ID of the private network ACL" + value = module.vpc.private_network_acl_id +} + +output "private_network_acl_arn" { + description = "ARN of the private network ACL" + value = module.vpc.private_network_acl_arn +} + +output "outpost_network_acl_id" { + description = "ID of the outpost network ACL" + value = module.vpc.outpost_network_acl_id +} + +output "outpost_network_acl_arn" { + description = "ARN of the outpost network ACL" + value = module.vpc.outpost_network_acl_arn +} + +output "intra_network_acl_id" { + description = "ID of the intra network ACL" + value = module.vpc.intra_network_acl_id +} + +output "intra_network_acl_arn" { + description = "ARN of the intra network ACL" + value = module.vpc.intra_network_acl_arn +} + +output "database_network_acl_id" { + description = "ID of the database network ACL" + value = 
+  value       = module.vpc.database_network_acl_id
+}
+
+output "database_network_acl_arn" {
+  description = "ARN of the database network ACL"
+  value       = module.vpc.database_network_acl_arn
+}
+
+output "redshift_network_acl_id" {
+  description = "ID of the redshift network ACL"
+  value       = module.vpc.redshift_network_acl_id
+}
+
+output "redshift_network_acl_arn" {
+  description = "ARN of the redshift network ACL"
+  value       = module.vpc.redshift_network_acl_arn
+}
+
+output "elasticache_network_acl_id" {
+  description = "ID of the elasticache network ACL"
+  value       = module.vpc.elasticache_network_acl_id
+}
+
+output "elasticache_network_acl_arn" {
+  description = "ARN of the elasticache network ACL"
+  value       = module.vpc.elasticache_network_acl_arn
+}
+
+# VPC flow log
+output "vpc_flow_log_id" {
+  description = "The ID of the Flow Log resource"
+  value       = module.vpc.vpc_flow_log_id
+}
+
+output "vpc_flow_log_destination_arn" {
+  description = "The ARN of the destination for VPC Flow Logs"
+  value       = module.vpc.vpc_flow_log_destination_arn
+}
+
+output "vpc_flow_log_destination_type" {
+  description = "The type of the destination for VPC Flow Logs"
+  value       = module.vpc.vpc_flow_log_destination_type
+}
+
+output "vpc_flow_log_cloudwatch_iam_role_arn" {
+  description = "The ARN of the IAM role used when pushing logs to Cloudwatch log group"
+  value       = module.vpc.vpc_flow_log_cloudwatch_iam_role_arn
+}
+
+# VPC endpoints
+#output "vpc_endpoints" {
+#  description = "Array containing the full resource object and attributes for all endpoints created"
+#  value       = module.vpc_endpoints.endpoints
+#}
diff --git a/infra-examples/aws/vpc/providers.tf b/infra-examples/aws/vpc/providers.tf
new file mode 100644
index 0000000..b67acf4
--- /dev/null
+++ b/infra-examples/aws/vpc/providers.tf
@@ -0,0 +1,19 @@
+#------------------------------------------------------------------------------
+# written by: Lawrence McDaniel
+#             https://lawrencemcdaniel.com/
+#
+# date: Aug-2022
+#
+# usage: all providers for this module. The general strategy
+#        is to manage authentication via the AWS CLI where possible, simply to limit
+#        the environment requirements needed to get this module to work.
+#
+#        another alternative for each of the providers would be to rely on
+#        the local kubeconfig file.
+#------------------------------------------------------------------------------
+
+# Configure the AWS Provider
+provider "aws" {
+  region = var.aws_region
+}
+
diff --git a/infra-examples/aws/vpc/variables.tf b/infra-examples/aws/vpc/variables.tf
new file mode 100644
index 0000000..f7dca9e
--- /dev/null
+++ b/infra-examples/aws/vpc/variables.tf
@@ -0,0 +1,85 @@
+#------------------------------------------------------------------------------
+# written by: Miguel Afonso
+#             https://www.linkedin.com/in/mmafonso/
+#
+# date: Aug-2021
+#
+# usage: create a VPC to contain all Open edX backend resources.
+#------------------------------------------------------------------------------
+variable "aws_region" {
+  description = "The AWS region in which to create the VPC and its resources."
+  type        = string
+  default     = "us-east-1"
+}
+
+variable "cidr" {
+  description = "The CIDR block for the VPC. Override the default if it conflicts with your existing network ranges."
+  type        = string
+  default     = "192.168.0.0/20"
+}
+
+variable "database_subnets" {
+  description = "A list of database subnets"
+  type        = list(string)
+  default     = ["192.168.8.0/24", "192.168.9.0/24"]
+}
+
+variable "elasticache_subnets" {
+  description = "A list of elasticache subnets"
+  type        = list(string)
+  default     = ["192.168.10.0/24", "192.168.11.0/24"]
+}
+
+variable "enable_ipv6" {
+  description = "Requests an Amazon-provided IPv6 CIDR block with a /56 prefix length for the VPC. You cannot specify the range of IP addresses, or the size of the CIDR block."
+  type        = bool
+  default     = false
+}
+
+variable "enable_nat_gateway" {
+  description = "Should be true if you want to provision NAT Gateways for each of your private networks"
+  type        = bool
+  default     = true
+}
+
+variable "one_nat_gateway_per_az" {
+  description = "Should be true if you want only one NAT Gateway per availability zone. Requires var.azs to be set, and the number of public_subnets created to be greater than or equal to the number of availability zones specified in var.azs"
+  type        = bool
+  default     = true
+}
+
+variable "single_nat_gateway" {
+  description = "Should be true if you want to provision a single shared NAT Gateway across all of your private networks"
+  type        = bool
+  default     = false
+}
+
+variable "enable_dns_hostnames" {
+  description = "Should be true to enable DNS hostnames in the VPC"
+  type        = bool
+  default     = false
+}
+
+variable "name" {
+  description = "Name to be used on all the resources as identifier"
+  type        = string
+  default     = "openedx-k8s-harmony"
+}
+
+variable "private_subnets" {
+  description = "A list of private subnets inside the VPC"
+  type        = list(string)
+  default     = ["192.168.4.0/24", "192.168.5.0/24", "192.168.6.0/24"]
+}
+
+variable "public_subnets" {
+  description = "A list of public subnets inside the VPC"
+  type        = list(string)
+  default     = ["192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24"]
+}
+
+variable "tags" {
+  description = "A map of tags to add to all resources"
+  type        = map(string)
+  default     = {}
+}
diff --git a/infra-examples/aws/vpc/versions.tf b/infra-examples/aws/vpc/versions.tf
new file mode 100644
index 0000000..ac1fc42
--- /dev/null
+++ b/infra-examples/aws/vpc/versions.tf
@@ -0,0 +1,22 @@
+#------------------------------------------------------------------------------
+# written by: Lawrence McDaniel
+#             https://lawrencemcdaniel.com/
+#
+# date: March-2022
+#
+# usage: Terraform and provider version constraints for the VPC example.
+#------------------------------------------------------------------------------
+terraform {
+  required_version = "~> 1.3"
+
+  required_providers {
+    aws = {
+      source  = "hashicorp/aws"
+      version = "~> 4.65"
+    }
+    local = {
+      source  = "hashicorp/local"
+      version = "~> 2.4"
+    }
+  }
+}
diff --git a/infra-example/k8s-cluster/main.tf b/infra-examples/digitalocean/k8s-cluster/main.tf
similarity index 100%
rename from infra-example/k8s-cluster/main.tf
rename to infra-examples/digitalocean/k8s-cluster/main.tf
diff --git a/infra-example/main.tf b/infra-examples/digitalocean/main.tf
similarity index 100%
rename from infra-example/main.tf
rename to infra-examples/digitalocean/main.tf
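
For reference, the defaults declared in `variables.tf` above can be overridden per environment without editing the module. The sketch below is a hypothetical `my-vpc.auto.tfvars` placed alongside the VPC example; the file name and every value are illustrative assumptions, not part of this change. Terraform automatically loads any `*.auto.tfvars` file during `terraform plan` and `terraform apply`.

```
# my-vpc.auto.tfvars -- illustrative overrides for the VPC example (placeholder values)
aws_region             = "eu-west-1"
name                   = "openedx-k8s-harmony"
cidr                   = "10.0.0.0/16"
public_subnets         = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
private_subnets        = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
database_subnets       = ["10.0.8.0/24", "10.0.9.0/24"]
elasticache_subnets    = ["10.0.10.0/24", "10.0.11.0/24"]
enable_dns_hostnames   = true
single_nat_gateway     = true  # one shared NAT gateway keeps costs down for non-production clusters
one_nat_gateway_per_az = false # disabled so the single shared gateway takes effect

tags = {
  Environment = "staging"
  Project     = "openedx-k8s-harmony"
}
```

With a file like this in place, plain `terraform plan` and `terraform apply` pick up the overrides with no extra `-var-file` flags.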